Preface to the Second Edition


It is now 10 years since the first edition of this book appeared in 1980. The
intervening decade has seen tremendous advances take place in mathematics
generally, and in number theory in particular. It would seem desirable
to treat some of these advances, and with the addition of two new
chapters, we are able to cover some portion of this new material.

As examples of important new work that we have not included, we 
mention the following two results: 

(1) The first case of Fermat’s last theorem is true for infinitely many
prime exponents p. This means that, for infinitely many primes p, the
equation $x^p + y^p = z^p$ has no solutions in nonzero integers with
$p \nmid xyz$. This was proved by L.M. Adleman and D.R. Heath-Brown and
independently by E. Fouvry. An overview of the proof is given by
Heath-Brown in the Mathematical Intelligencer (Vol. 7, No. 6, 1985).

(2) Let $p_1$, $p_2$, and $p_3$ be three distinct primes. Then at least one of them is
a primitive root for infinitely many primes q. Recall that E. Artin
conjectured that, if $a \in \mathbb{Z}$ is not 0, 1, -1, or a square, then there are
infinitely many primes q such that a is a primitive root modulo q. The
theorem we have stated was proved in a weaker form by R. Gupta and
M.R. Murty, and then strengthened by the combined efforts of R.
Gupta, M.R. Murty, V.K. Murty, and D.R. Heath-Brown. An exposition
of this result, as well as an analogue on elliptic curves, is given by
M.R. Murty in the Mathematical Intelligencer (Vol. 10, No. 4, 1988).

The new material that we have added falls principally within the framework
of arithmetic geometry. In Chapter 19 we give a complete proof of
L.J. Mordell’s fundamental theorem, which asserts that the group of rational
points on an elliptic curve, defined over the rational numbers, is
finitely generated. In keeping with the spirit of the book, the proof (due in 
essence to A. Weil) is elementary. It makes no use of cohomology groups 
or any other advanced machinery. It does use finiteness of class number 
and a weak form of the Dirichlet unit theorem; both results are proved in 
the text. 

The second new chapter, Chapter 20, is an overview of G. Faltings’s 
proof of the Mordell conjecture and recent progress on the arithmetic of 
elliptic curves, especially the work of B. Gross, V.A. Kolyvagin, K. 
Rubin, and D. Zagier. Some of this work has surprising applications to 
other areas of number theory. We discuss one application to Fermat’s last 
theorem, due to G. Frey, J.P. Serre, and K. Ribet. Another important 
application is the solution of an old problem due to K.F. Gauss about 
class numbers of imaginary quadratic number fields. This comes about by 
combining the work of B. Gross and D. Zagier with a result of D. Goldfeld.
This chapter contains few proofs. Its main purpose is to give an
informative survey in the hope that the reader will be inspired to learn the 
background necessary to a better understanding and appreciation of these 
important new developments. 

The rest of the book is essentially unchanged. An attempt has been
made to correct errors and misprints. In an effort to keep confusion to a 
minimum, we have not changed the bibliography at the end of the book. 
New references for the two new chapters, Chapters 19 and 20, will be 
found at the end of those chapters. We would like to thank Toru Nakahara 
and others for submitting a list of misprints from the first edition. Also, we 
thank Linda Guthrie for typing portions of the final chapters. 

We have both been very pleased with the warm reception that the first 
edition of this book received. It is our hope that the new edition will 
continue to entice readers to delve deeper into the mysteries of this
ancient, beautiful, and still vital subject.

February 1990 Kenneth Ireland 

Michael Rosen 


Addendum to Second Edition, Second Corrected Printing

The second printing of the second edition is unchanged except for corrections
and the addition of a few clarifying comments. I would like to thank
K. Conrad, M. Jastrzebski, F. Lemmermeyer and others who took the 
trouble to send us detailed lists of misprints. 


November 1992 


Michael Rosen 



Preface 


This book is a revised and greatly expanded version of our book Elements of 
Number Theory published in 1972. As with the first book the primary audience 
we envisage consists of upper level undergraduate mathematics majors and 
graduate students. We have assumed some familiarity with the material in a 
standard undergraduate course in abstract algebra. A large portion of 
Chapters 1-11 can be read even without such background with the aid of a 
small amount of supplementary reading. The later chapters assume some 
knowledge of Galois theory, and in Chapters 16 and 18 an acquaintance with 
the theory of complex variables is necessary. 

Number theory is an ancient subject and its content is vast. Any introductory
book must, of necessity, make a very limited selection from the
fascinating array of possible topics. Our focus is on topics which point in the 
direction of algebraic number theory and arithmetic algebraic geometry. By a 
careful selection of subject matter we have found it possible to exposit some 
rather advanced material without requiring very much in the way of technical 
background. Most of this material is classical in the sense that it was
discovered during the nineteenth century and earlier, but it is also modern
because it is intimately related to important research going on at the present 
time. 

In Chapters 1-5 we discuss prime numbers, unique factorization, arithmetic
functions, congruences, and the law of quadratic reciprocity. Very little
is demanded in the way of background. Nevertheless it is remarkable how a 
modicum of group and ring theory introduces unexpected order into the 
subject. For example, many scattered results turn out to be parts of the answer 
to a natural question: What is the structure of the group of units in the ring 
Z/nZ? 

Reciprocity laws constitute a major theme in the later chapters. The law
of quadratic reciprocity, beautiful in itself, is the first of a series of reciprocity 
laws which lead ultimately to the Artin reciprocity law, one of the major 
achievements of algebraic number theory. We travel along the road beyond 
quadratic reciprocity by formulating and proving the laws of cubic and 
biquadratic reciprocity. In preparation for this many of the techniques of
algebraic number theory are introduced: algebraic numbers and algebraic
integers, finite fields, splitting of primes, etc. Another important tool in this
investigation (and in others!) is the theory of Gauss and Jacobi sums. This 
material is covered in Chapters 6-9. Later in the book we formulate and prove 
the more advanced partial generalization of these results, the Eisenstein 
reciprocity law. 

A second major theme is that of diophantine equations, at first over finite 
fields and later over the rational numbers. The discussion of polynomial 
equations over finite fields is begun in Chapters 8 and 10 and culminates in 
Chapter 11 with an exposition of a portion of the paper “Number of solutions 
of equations over finite fields” by A. Weil. This paper, published in 1948, has 
been very influential in the recent development of both algebraic geometry 
and number theory. In Chapters 17 and 18 we consider diophantine equations 
over the rational numbers. Chapter 17 covers many standard topics from 
sums of squares to Fermat’s Last Theorem. However, because of material 
developed earlier we are able to treat a number of these topics from a novel 
point of view. Chapter 18 is about the arithmetic of elliptic curves. It differs
from the earlier chapters in that it is primarily an overview with many
definitions and statements of results but few proofs. Nevertheless, by
concentrating on some important special cases we hope to convey to the reader
something of the beauty of the accomplishments in this area where much work 
is being done and many mysteries remain. 

The third, and final, major theme is that of zeta functions. In Chapter 11 we 
discuss the congruence zeta function associated to varieties defined over finite 
fields. In Chapter 16 we discuss the Riemann zeta function and the Dirichlet
L-functions. In Chapter 18 we discuss the zeta function associated to an
algebraic curve defined over the rational numbers and Hecke L-functions. 
Zeta functions compress a large amount of arithmetic information into a 
single function and make possible the application of the powerful methods of 
analysis to number theory. 

Throughout the book we place considerable emphasis on the history of 
our subject. In the notes at the end of each chapter we give a brief historical 
sketch and provide references to the literature. The bibliography is extensive,
containing many items both classical and modern. Our aim has been to
provide the reader with a wealth of material for further study. 

There are many exercises, some routine, some challenging. Some of the 
exercises supplement the text by providing a step by step guide through the 
proofs of important results. In the later chapters a number of exercises have 
been adapted from results which have appeared in the recent literature. We 
hope that working through the exercises will be a source of enjoyment as well
as instruction. 

In the writing of this book we have been helped immensely by the interest 
and assistance of many mathematical friends and acquaintances. We thank 
them all. In particular we would like to thank Henry Pohlmann who insisted 
we follow certain themes to their logical conclusion, David Goss for allowing 
us to incorporate some of his work into Chapter 16, and Oisin McGuiness 
for his invaluable assistance in the preparation of Chapter 18. We would 
like to thank Dale Cavanaugh, Janice Phillips, and especially Carol Ferreira, 
for their patience and expertise in typing large portions of the manuscript. 
Finally, the second author wishes to express his gratitude to the Vaughn 
Foundation Fund for financial support during his sabbatical year in 
Berkeley, California (1979/80). 

July 25, 1981 Kenneth Ireland

Michael Rosen 
Chapter 1


Unique Factorization 


The notion of prime number is fundamental in number 
theory. The first part of this chapter is devoted to proving 
that every integer can be written as a product of primes 
in an essentially unique way. 

After that, we shall prove an analogous theorem in the
ring of polynomials over a field.

On a more abstract plane, the general idea of unique
factorization is treated for principal ideal domains.

Finally, returning from the abstract to the concrete, the 
general theory is applied to two special rings that will be 
important later in the book. 


§1 Unique Factorization in Z 

As a first approximation, number theory may be defined as the study of the 
natural numbers 1, 2, 3, 4, .... L. Kronecker once remarked (speaking of 
mathematics generally) that God made the natural numbers and all the rest 
is the work of man. Although the natural numbers constitute, in some sense, 
the most elementary mathematical system, the study of their properties has 
provided generations of mathematicians with problems of unending
fascination.

We say that a number a divides a number b if there is a number c such
that b = ac. If a divides b, we use the notation $a \mid b$. For example, $2 \mid 8$, $3 \mid 15$,
but $6 \nmid 21$. If we are given a number, it is tempting to factor it again and
again until further factorization is impossible. For example,
$180 = 18 \times 10 = 2 \times 9 \times 2 \times 5 = 2 \times 3 \times 3 \times 2 \times 5$. Numbers that cannot be factored
further are called primes. To be more precise, we say that a number p is a
prime if its only divisors are 1 and p. Prime numbers are very important
because every number can be written as a product of primes. Moreover, 
primes are of great interest because there are many problems about them 
that are easy to state but very hard to prove. Indeed many old problems 
about primes are unsolved to this day. 
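
The process of factoring a number again and again can be carried out mechanically. The following Python sketch does so by trial division. The function name `factor` is an illustrative choice, not notation from the text, and the method is practical only for small numbers.

```python
def factor(n):
    """Return the prime factors of an integer n >= 2 in nondecreasing order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # d divides n, so split off one factor d
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # the remaining cofactor has no divisor <= its square root
        factors.append(n)
    return factors

print(factor(180))  # [2, 2, 3, 3, 5]
```

Each divisor found is automatically prime, since all smaller primes have already been divided out.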

The first prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 
43, .... One may ask if there are infinitely many prime numbers. The answer 
is yes. Euclid gave an elegant proof of this fact over 2000 years ago. We shall 
give his proof and several others in Chapter 2. One can ask other questions 
of this nature. Let $\pi(x)$ be the number of primes between 1 and x. What can
be said about the function $\pi(x)$? Several mathematicians found by experiment
that for large x the function $\pi(x)$ was approximately equal to $x/\ln(x)$. This
assertion, known as the prime number theorem, was proved toward the end
of the nineteenth century by J. Hadamard and independently by Ch.-J. de la
Vallée Poussin. More precisely, they proved

$$\lim_{x \to \infty} \frac{\pi(x)}{x/\ln(x)} = 1.$$
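
The experimental observation behind the prime number theorem is easy to repeat. The sketch below counts primes with the sieve of Eratosthenes and prints the ratio of the prime count to x/ln(x). The name `prime_count` and the sample values of x are assumptions made for this illustration.

```python
import math

def prime_count(x):
    """Number of primes p with p <= x, computed with the sieve of Eratosthenes."""
    if x < 2:
        return 0
    is_prime = [True] * (x + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            for j in range(i * i, x + 1, i):   # cross out the multiples of i
                is_prime[j] = False
    return sum(is_prime)

for x in (10**3, 10**4, 10**5):
    ratio = prime_count(x) / (x / math.log(x))
    print(x, prime_count(x), round(ratio, 4))
```

For these values of x the ratio stays somewhat above 1 and decreases slowly, which fits the limit statement without, of course, proving it.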


Even from a small list of primes one can notice that they have a tendency 
to occur in pairs, for example, 3 and 5, 5 and 7, 11 and 13, 17 and 19. Do
there exist infinitely many prime pairs? The answer is unknown. 

Another famous unsolved problem is known as the Goldbach conjecture 
(C. H. Goldbach). Can every even number be written as the sum of two 
primes? Goldbach came to this conjecture experimentally. Nowadays 
electronic computers make it possible to experiment with very large numbers. 
No counterexample to Goldbach’s conjecture has ever been found. Great
progress toward a proof has been made by I. M. Vinogradov and L. Schnirelmann.
In 1937 Vinogradov was able to show that every sufficiently large odd
number is the sum of three odd primes.
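
Experiments of the kind Goldbach made by hand take only a few lines on a computer. The following sketch searches for a representation of an even number as a sum of two primes. The names `is_prime` and `goldbach_pair` are illustrative, and the range checked is an arbitrary small one. The even number 2 is left out, since it is not a sum of two primes.

```python
def is_prime(n):
    """Primality test by trial division, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_pair(n):
    """Return primes (p, q) with p + q == n and p <= q, or None if none exists."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None

# Every even number from 4 to 2000 has at least one such representation.
assert all(goldbach_pair(n) is not None for n in range(4, 2001, 2))
print(goldbach_pair(100))  # (3, 97)
```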

In this book we shall not study in depth the distribution of prime numbers 
or “additive” problems about them (such as the Goldbach conjecture). 
Rather our concern will be about the way primes enter into the multiplicative 
structure of numbers. The main theorem along these lines goes back essentially
to Euclid. It is the theorem of unique factorization. This theorem is
sometimes referred to as the fundamental theorem of arithmetic. It deserves 
the title. In one way or another almost all the results we shall discuss depend 
on it. The theorem states that every number can be factored into a product of 
primes in a unique way. What uniqueness means will be explained below. 

As an illustration consider the number 180. We have seen that
$180 = 2 \times 2 \times 3 \times 3 \times 5 = 2^2 \times 3^2 \times 5$. Uniqueness in this case means that
the only primes dividing 180 are 2, 3, and 5 and that the exponents 2, 2, and
1 are uniquely determined by 180.

Z will denote the ring of integers, i.e., the set 0, ±1, ±2, ±3, ..., together
with the usual definition of sum and product. It will be more convenient to
work with Z rather than restricting ourselves to the positive integers. The
notion of divisibility carries over with no difficulty to Z. If p is a positive
prime, -p will also be a prime. We shall not consider 1 or -1 as primes even
though they fit the definition. This is simply a useful convention. Note that
1 and -1 divide everything and that they are the only integers with this
property. They are called the units of Z. Notice also that every nonzero
integer divides zero. As is usual we shall exclude division by zero.

There are a number of simple properties of division that we shall simply 
list. The reader may wish to supply the proofs. 

(1) $a \mid a$, $a \neq 0$.

(2) If $a \mid b$ and $b \mid a$, then $a = \pm b$.

(3) If $a \mid b$ and $b \mid c$, then $a \mid c$.

(4) If $a \mid b$ and $a \mid c$, then $a \mid b + c$.

Let $n \in \mathbb{Z}$ and let p be a prime. Then if n is not zero, there is a nonnegative
integer a such that $p^a \mid n$ but $p^{a+1} \nmid n$. This is easy to see if both p and n are
positive for then the powers of p get larger and larger and eventually exceed n.
The other cases are easily reduced to this one. The number a is called the
order of n at p and is denoted by $\operatorname{ord}_p n$. Roughly speaking $\operatorname{ord}_p n$ is the
number of times p divides n. If n = 0, we set $\operatorname{ord}_p 0 = \infty$. Notice that
$\operatorname{ord}_p n = 0$ if and only if (iff) $p \nmid n$.
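
The order of n at p can be computed exactly as the definition suggests, by dividing out p until it no longer divides. In this sketch the function name `ord_p` is an illustrative choice, p is assumed to be a positive prime, and the convention for n = 0 is represented by Python's `math.inf`.

```python
import math

def ord_p(p, n):
    """Largest a with p**a dividing n; infinity when n == 0 (p a positive prime)."""
    if n == 0:
        return math.inf
    a = 0
    while n % p == 0:   # p still divides n
        n //= p
        a += 1
    return a

print(ord_p(2, 180), ord_p(3, 180), ord_p(5, 180), ord_p(7, 180))  # 2 2 1 0
```

The sign of n plays no role, so for instance `ord_p(2, -8)` is 3.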

Lemma 1. Every nonzero integer can be written as a product of primes. 

Proof. Assume that there is an integer that cannot be written as a product of
primes. Let N be the smallest positive integer with this property. Since N
cannot itself be prime we must have N = mn, where 1 < m, n < N. However,
since m and n are positive and smaller than N they must each be a
product of primes. But then so is N = mn. This is a contradiction.

The proof can be given in a more positive way by using mathematical
induction. It is enough to prove the result for all positive integers. 2 is a
prime. Suppose that 2 < N and that we have proved the result for all
numbers m such that $2 \le m < N$. We wish to show that N is a product of
primes. If N is a prime, there is nothing to do. If N is not a prime, then
N = mn, where $2 \le m, n < N$. By induction both m and n are products of
primes and thus so is N. □

By collecting terms we can write $n = p_1^{a_1} p_2^{a_2} \cdots p_m^{a_m}$, where the $p_i$ are
primes and the $a_i$ are nonnegative integers. We shall use the following
notation:

$$n = (-1)^{\varepsilon(n)} \prod_p p^{a(p)},$$

where $\varepsilon(n) = 0$ or 1 depending on whether n is positive or negative and
where the product is over all positive primes. The exponents a(p) are
nonnegative integers and, of course, a(p) = 0 for all but finitely many primes.
For example, if n = 180, we have $\varepsilon(n) = 0$, a(2) = 2, a(3) = 2, and a(5) = 1,
and all other a(p) = 0.

We can now state the main theorem. 

Theorem 1. For every nonzero integer n there is a prime factorization

$$n = (-1)^{\varepsilon(n)} \prod_p p^{a(p)},$$

with the exponents uniquely determined by n. In fact, we have $a(p) = \operatorname{ord}_p n$.

The proof of this theorem is not as easy as it may seem. We shall postpone
the proof until we have established a few preliminary results. 

Lemma 2. If $a, b \in \mathbb{Z}$ and b > 0, there exist $q, r \in \mathbb{Z}$ such that a = qb + r
with $0 \le r < b$.

Proof. Consider the set of all integers of the form $a - xb$ with $x \in \mathbb{Z}$. This set
includes positive elements. Let $r = a - qb$ be the least nonnegative element
in this set. We claim that $0 \le r < b$. If not, $r = a - qb \ge b$ and so
$0 \le a - (q + 1)b < r$, which contradicts the minimality of r. □
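
Lemma 2 is the familiar division with remainder. As a small sketch, Python's built-in `divmod` already returns a quotient and remainder with $0 \le r < b$ when b > 0, also for negative a. The wrapper name `divide` is an illustrative choice.

```python
def divide(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < b, assuming b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q, r = divmod(a, b)   # floor division gives 0 <= r < b whenever b > 0
    return q, r

print(divide(17, 5))   # (3, 2)
print(divide(-17, 5))  # (-4, 3)
```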

Definition. If $a_1, a_2, \ldots, a_n \in \mathbb{Z}$, we define $(a_1, a_2, \ldots, a_n)$ to be the set of
all integers of the form $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$ with $x_1, x_2, \ldots, x_n \in \mathbb{Z}$.

Let $A = (a_1, a_2, \ldots, a_n)$. Notice that the sum and difference of two
elements in A are again in A. Also, if $a \in A$ and $r \in \mathbb{Z}$, then $ra \in A$. In
ring-theoretic language, A is an ideal in the ring Z.

Lemma 3. If $a, b \in \mathbb{Z}$, then there is a $d \in \mathbb{Z}$ such that (a, b) = (d).

Proof. We may assume that not both a and b are zero so that there are
positive elements in (a, b). Let d be the smallest positive element in (a, b).
Clearly $(d) \subseteq (a, b)$. We shall show that the reverse inclusion also holds.

Suppose that $c \in (a, b)$. By Lemma 2 there exist integers q and r such that
c = qd + r with $0 \le r < d$. Since both c and d are in (a, b) it follows that
$r = c - qd$ is also in (a, b). Since $0 \le r < d$ we must have r = 0. Thus
$c = qd \in (d)$. □
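
The content of Lemma 3 can be observed numerically. The sketch below lists a finite sample of the set (42, 196), takes its smallest positive element d, and checks that every sampled element is a multiple of d. The function name `ideal_elements` and the search bound are assumptions made for this illustration; a finite sample illustrates the lemma but does not prove it.

```python
def ideal_elements(a, b, bound=20):
    """Elements a*x + b*y of (a, b) with |x|, |y| <= bound (a finite sample)."""
    return {a * x + b * y
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)}

sample = ideal_elements(42, 196)
d = min(v for v in sample if v > 0)      # smallest positive element found
assert all(v % d == 0 for v in sample)   # every sampled element lies in (d)
print(d)  # 14
```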

Definition. Let $a, b \in \mathbb{Z}$. An integer d is called a greatest common divisor of
a and b if d is a divisor of both a and b and if every other common divisor of
a and b divides d.

Notice that if c is another greatest common divisor of a and b, then we
must have $c \mid d$ and $d \mid c$ and so $c = \pm d$. Thus the greatest common divisor of
two numbers, if it exists, is determined up to sign.

As an example, one may check that 14 is a greatest common divisor of 
42 and 196. The following lemma will establish the existence of the greatest 
common divisor, but it will not give a method for computing it. In the 
Exercises we shall outline an efficient method of computation known as the 
Euclidean algorithm. 
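
For orientation, here is a minimal Python sketch of the Euclidean algorithm in its usual form: repeatedly replace the pair (a, b) by (b, a mod b). The function name `gcd` is illustrative, and the details are the standard ones rather than a transcription of the Exercises.

```python
def gcd(a, b):
    """Nonnegative greatest common divisor of a and b by the Euclidean algorithm."""
    a, b = abs(a), abs(b)
    while b != 0:
        a, b = b, a % b   # common divisors of (a, b) and (b, a mod b) coincide
    return a

print(gcd(42, 196))  # 14
```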

Lemma 4. Let $a, b \in \mathbb{Z}$. If (a, b) = (d) then d is a greatest common divisor of
a and b.

Proof. Since $a \in (d)$ and $b \in (d)$ we see that d is a common divisor of a and b.
Suppose that c is a common divisor. Then c divides every number of the form
ax + by. In particular $c \mid d$. □

Definition. We say that two integers a and b are relatively prime if the only
common divisors are ±1, the units.

It is fairly standard to use the notation (a, b) for the greatest common
divisor of a and b. The way we have defined things, (a, b) is a set. However,
since (a, b) = (d) and d is a greatest common divisor (if we require d to be
positive, we may use the article the) it will not be too confusing to use the
symbol (a, b) for both meanings. With this convention we can say that a and
b are relatively prime if (a, b) = 1.

Proposition 1.1.1. Suppose that $a \mid bc$ and that (a, b) = 1. Then $a \mid c$.

Proof. Since (a, b) = 1 there exist integers r and s such that ra + sb = 1.
Therefore, rac + sbc = c. Since a divides the left-hand side of this equation
we have $a \mid c$. □

This proposition is false if $(a, b) \neq 1$. For example, $6 \mid 24$ but $6 \nmid 3$ and
$6 \nmid 8$.
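
The integers r and s with ra + sb = 1 used in the proof can be found explicitly. The sketch below uses the extended Euclidean algorithm, a standard technique that the text itself does not describe at this point. The function name `extended_gcd` and the sample numbers are illustrative.

```python
def extended_gcd(a, b):
    """Return (d, r, s) with d = gcd(a, b) >= 0 and r*a + s*b == d."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x   # invariant: old_x*a + old_y*b == old_r
        old_y, y = y, old_y - q * y
    if old_r < 0:                     # normalize the sign of the gcd
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

a, b, c = 9, 14, 18                   # 9 divides 14 * 18 and (9, 14) = 1
d, r, s = extended_gcd(a, b)
assert d == 1 and r * a + s * b == 1
assert r * a * c + s * b * c == c     # the identity rac + sbc = c from the proof
assert c % a == 0                     # hence a divides c
```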

Corollary 1. If p is a prime and $p \mid bc$, then either $p \mid b$ or $p \mid c$.

Proof. The only divisors of p are ±1 and ±p. Thus (p, b) = 1 or p; i.e., either
$p \mid b$ or p and b are relatively prime. If $p \mid b$, we are done. If not, (p, b) = 1 and
so, by the proposition, $p \mid c$. □

We can state the corollary in a slightly different form that is often useful:
If p is a prime and $p \nmid b$ and $p \nmid c$, then $p \nmid bc$.

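Corollary 2. Suppose that p is a prime and that $a, b \in \mathbb{Z}$. Then
$\operatorname{ord}_p ab = \operatorname{ord}_p a + \operatorname{ord}_p b$.

Proof. Let $\alpha = \operatorname{ord}_p a$ and $\beta = \operatorname{ord}_p b$. Then $a = p^{\alpha} c$ and $b = p^{\beta} d$, where
$p \nmid c$ and $p \nmid d$. Then $ab = p^{\alpha + \beta} cd$ and by Corollary 1 $p \nmid cd$. Thus
$\operatorname{ord}_p ab = \alpha + \beta = \operatorname{ord}_p a + \operatorname{ord}_p b$. □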
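
Corollary 2 is easy to spot-check numerically. In the sketch below the function name `ord_p` is an illustrative choice, p is assumed to be a positive prime, and the test pairs are arbitrary nonzero integers.

```python
def ord_p(p, n):
    """Largest a such that p**a divides the nonzero integer n."""
    if n == 0:
        raise ValueError("n must be nonzero")
    a = 0
    while n % p == 0:
        n //= p
        a += 1
    return a

p = 3
for a, b in [(18, 45), (-12, 7), (81, 10)]:
    # additivity of the order at p under multiplication
    assert ord_p(p, a * b) == ord_p(p, a) + ord_p(p, b)
```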

We are now in a position to prove the main theorem.

Apply the function $\operatorname{ord}_q$ to both sides of the equation

$$n = (-1)^{\varepsilon(n)} \prod_p p^{a(p)}$$

and use the property of $\operatorname{ord}_q$ given by Corollary 2. The result is

$$\operatorname{ord}_q n = \varepsilon(n) \operatorname{ord}_q(-1) + \sum_p a(p) \operatorname{ord}_q(p).$$

Now, from the definition of $\operatorname{ord}_q$ we have $\operatorname{ord}_q(-1) = 0$ and $\operatorname{ord}_q(p) = 0$
if $p \neq q$ and 1 if p = q. Thus the right-hand side collapses to the single term
a(q), i.e., $\operatorname{ord}_q n = a(q)$, which is what we wanted to prove.

It is to be emphasized that the key step in the proof is Corollary 1: namely,
if $p \mid ab$, then $p \mid a$ or $p \mid b$. Whatever difficulty there is in the proof is centered
about this fact.


§2 Unique Factorization in k[x]

The theorem of unique factorization can be formulated and proved in more
general contexts than that of Section 1. In this section we shall consider the
ring k[x] of polynomials with coefficients in a field k. In Section 3 we shall
consider principal ideal domains. It will turn out that the analysis of these
situations will prove useful in the study of the integers.

If $f, g \in k[x]$, we say that f divides g if there is an $h \in k[x]$ such that
g = fh.

If deg f denotes the degree of f, we have deg fg = deg f + deg g. Also,
remember that deg f = 0 iff f is a nonzero constant. It follows that $f \mid g$ and
$g \mid f$ iff f = cg, where c is a nonzero constant. It also follows that the only
polynomials that divide all the others are the nonzero constants. These are
the units of k[x]. A nonconstant polynomial p is said to be irreducible if
$q \mid p$ implies that q is either a constant or a constant times p. Irreducible
polynomials are the analog of prime numbers.

Lemma 1. Every nonconstant polynomial is the product of irreducible
polynomials.

Proof. The proof is by induction on the degree. It is easy to see that
polynomials of degree 1 are irreducible. Assume that we have proved the result
for all polynomials of degree less than n and that deg f = n. If f is irreducible,
we are done. Otherwise f = gh, where $1 \le \deg g, \deg h < n$. By the induction
assumption both g and h are products of irreducible polynomials. Thus
so is f = gh. □

It is convenient to define monic polynomial. A polynomial f is called monic
if its leading coefficient is 1. For example, $x^2 + x - 3$ and $x^3 - x^2 + 3x + 17$
are monic but $2x^3 - 5$ and $3x^4 + 2x^2 - 1$ are not. Every polynomial
(except zero) is a constant times a monic polynomial.

Let p be a monic irreducible polynomial. We define $\operatorname{ord}_p f$ to be the
integer a defined by the property that $p^a \mid f$ but that $p^{a+1} \nmid f$. Such an integer
must exist since the degree of the powers of p gets larger and larger. Notice
that $\operatorname{ord}_p f = 0$ iff $p \nmid f$.

Theorem 2. Let $f \in k[x]$. Then we can write

$$f = c \prod_p p^{a(p)},$$

where the product is over all monic irreducible polynomials and c is a constant.
The constant c and the exponents a(p) are uniquely determined by f; in fact,
$a(p) = \operatorname{ord}_p f$.

The existence of such a product follows immediately from Lemma 1. As 
before, the uniqueness is more difficult and the proof will be postponed until 
we develop a few tools. 

Lemma 2. Let $f, g \in k[x]$. If $g \neq 0$, there exist polynomials $h, r \in k[x]$ such
that f = hg + r, where either r = 0 or $r \neq 0$ and deg r < deg g.

Proof. If $g \mid f$, simply set h = f/g and r = 0. If $g \nmid f$, let $r = f - hg$ be the
polynomial of least degree among all polynomials of the form $f - lg$ with
$l \in k[x]$. We claim that deg r < deg g. If not, let the leading term of r be
$ax^d$ and that of g be $bx^m$. Then $r - ab^{-1}x^{d-m}g = f - (h + ab^{-1}x^{d-m})g$ has
smaller degree than r and is of the given form. This is a contradiction. □
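
The degree-lowering step in the proof, subtracting $ab^{-1}x^{d-m}g$ from r, is exactly one step of polynomial long division. The following sketch carries it out over the field of rational numbers. Representing a polynomial as a list of coefficients with lowest degree first, and the names `trim` and `poly_divmod`, are assumptions made for this illustration.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (h, r) with f = h*g + r and either r = 0 or deg r < deg g."""
    f = [Fraction(c) for c in trim(f)]
    g = [Fraction(c) for c in trim(g)]
    if not g:
        raise ZeroDivisionError("g must be nonzero")
    h = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = f
    while len(r) >= len(g):
        d, m = len(r) - 1, len(g) - 1
        coeff = r[-1] / g[-1]           # a * b^(-1) in the notation of the proof
        h[d - m] = coeff
        for i, gc in enumerate(g):      # subtract coeff * x^(d-m) * g from r
            r[i + d - m] -= coeff * gc
        r = trim(r)
    return trim(h), r

# f = x^3 - x^2 + 3x + 17, g = x^2 + x - 3
h, r = poly_divmod([17, 3, -1, 1], [-3, 1, 1])
print(h == [-2, 1], r == [11, 8])  # True True, i.e. h = x - 2 and r = 8x + 11
```

Exact rational arithmetic is used so that the leading terms cancel exactly at every step.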

Definition. If $f_1, f_2, \ldots, f_n \in k[x]$, then $(f_1, f_2, \ldots, f_n)$ is the set of all
polynomials of the form $f_1 h_1 + f_2 h_2 + \cdots + f_n h_n$, where $h_1, h_2, \ldots, h_n \in k[x]$.

In ring-theoretic language $(f_1, f_2, \ldots, f_n)$ is the ideal generated by
$f_1, f_2, \ldots, f_n$.

Lemma 3. Given $f, g \in k[x]$ there is a $d \in k[x]$ such that (f, g) = (d).

Proof. In the set (f, g) let d be an element of least degree. We have $(d) \subseteq (f, g)$
and we want to prove the reverse inclusion. Let $c \in (f, g)$. If $d \nmid c$, then there
exist polynomials h and r such that c = hd + r with deg r < deg d. Since
c and d are in (f, g) we have $r = c - hd \in (f, g)$. Since r has smaller degree
than d this is a contradiction. Therefore, $d \mid c$ and $c \in (d)$. □

Definition. Let $f, g \in k[x]$. Then $d \in k[x]$ is said to be a greatest common
divisor of f and g if d divides f and g and every common divisor of f and g
divides d.

Notice that the greatest common divisor of two polynomials is determined 
up to multiplication by a constant. If we require it to be monic, it is uniquely 
determined and we may speak of the greatest common divisor. 

Lemma 4. Let $f, g \in k[x]$. By Lemma 3 there is a $d \in k[x]$ such that (f, g) =
(d). d is a greatest common divisor of f and g.

Proof. Since $f \in (d)$ and $g \in (d)$ we have $d \mid f$ and $d \mid g$. Suppose that $h \mid f$ and
that $h \mid g$. Then h divides every polynomial of the form fl + gm with $l, m \in k[x]$.
In particular $h \mid d$, and we are done. □

Definition. Two polynomials f and g are said to be relatively prime if the only
common divisors of f and g are constants. In other words, (f, g) = (1).

Proposition 1.2.1. If f and g are relatively prime and $f \mid gh$, then $f \mid h$.

Proof. If f and g are relatively prime, we have (f, g) = (1) so there are
polynomials l and m such that lf + mg = 1. Thus lfh + mgh = h. Since f
divides the left-hand side of this equation f must divide h. □

Corollary 1. If p is an irreducible polynomial and $p \mid fg$, then $p \mid f$ or $p \mid g$.

Proof. Since p is irreducible (p, f) = (p) or (1). In the first case $p \mid f$ and we
are done. In the second case p and f are relatively prime and the result
follows from the proposition. □

Corollary 2. If p is a monic irreducible polynomial and $f, g \in k[x]$, we have
$\operatorname{ord}_p fg = \operatorname{ord}_p f + \operatorname{ord}_p g$.

Proof. The proof is almost word for word the same as the proof of Corollary
2 to Proposition 1.1.1. □

The proof of Theorem 2 is now easy. Apply the function $\operatorname{ord}_q$ to both sides
of the relation

$$f = c \prod_p p^{a(p)}.$$

We find that

$$\operatorname{ord}_q f = \operatorname{ord}_q c + \sum_p a(p) \operatorname{ord}_q p.$$

Now, since c is a constant, $q \nmid c$ and $\operatorname{ord}_q c = 0$. Moreover, $\operatorname{ord}_q p = 0$ if
$q \neq p$ and 1 if q = p. Thus the above relation yields $\operatorname{ord}_q f = a(q)$. This
shows that the exponents are uniquely determined. It is clear that if the
exponents are uniquely determined by f, then so is c. This completes the
proof. □


§3 Unique Factorization in a Principal Ideal Domain 

The reader will not have failed to notice the great similarity in the methods 
of proof in Sections 1 and 2. In this section we shall prove an abstract theorem 
that includes the previous results as special cases. 

Throughout this section R will denote an integral domain. 

Definition 1. R is said to be a Euclidean domain if there is a function $\lambda$ from the
nonzero elements of R to the set {0, 1, 2, 3, ...} such that if $a, b \in R$, $b \neq 0$,
there exist $c, d \in R$ with the property a = cb + d and either d = 0 or
$\lambda(d) < \lambda(b)$.

The rings Z and k[x] are both Euclidean domains. In Z we can take
ordinary absolute value as the function $\lambda$; in the ring k[x] the function that
assigns to every polynomial its degree will serve the purpose.

Proposition 1.3.1. If R is a Euclidean domain and $I \subseteq R$ is an ideal, then there
is an element $a \in R$ such that $I = Ra = \{ra \mid r \in R\}$.

Proof. Consider the set of nonnegative integers $\{\lambda(b) \mid b \in I, b \neq 0\}$. Since
every set of nonnegative integers has a least element there is an $a \in I$, $a \neq 0$,
such that $\lambda(a) \le \lambda(b)$ for all $b \in I$, $b \neq 0$. We claim that I = Ra. Clearly,
$Ra \subseteq I$. Suppose that $b \in I$; then we know that there are elements $c, d \in R$
such that b = ca + d, where either d = 0 or $\lambda(d) < \lambda(a)$. Since
$d = b - ca \in I$ we cannot have $\lambda(d) < \lambda(a)$. Thus d = 0 and $b = ca \in Ra$. Therefore,
$I \subseteq Ra$ and we are done. □

For elements $a_1, \ldots, a_n \in R$, define
$(a_1, a_2, \ldots, a_n) = Ra_1 + Ra_2 + \cdots + Ra_n = \{\sum_{i=1}^{n} r_i a_i \mid r_i \in R\}$.
$(a_1, a_2, \ldots, a_n)$ is an ideal. If an ideal I
is equal to $(a_1, \ldots, a_n)$ for some elements $a_i \in I$, we say that I is finitely
generated. If I = (a) for some $a \in I$, we say that I is a principal ideal.

Definition 2. R is said to be a principal ideal domain (PID) if every ideal of R is 
principal. 

Proposition 1.3.1 asserts that every Euclidean domain is a PID. The converse
of this statement is false, although it is somewhat hard to provide
examples.

The remaining discussion in this section is about PID’s. The notion of 
Euclidean domain is useful because in practice one can show that many 
rings are PID’s by first establishing that they are Euclidean domains. We 
shall give two further examples in Section 4. 

We introduce some more terminology. If $a, b \in R$, $b \neq 0$, we say that b
divides a if a = bc for some $c \in R$. Notation: $b \mid a$. An element $u \in R$ is
called a unit if u divides 1. Two elements $a, b \in R$ are said to be associates if
a = bu for some unit u. An element $p \in R$ is said to be irreducible if $a \mid p$
implies that a is either a unit or an associate of p. A nonunit $p \in R$ is said to be
prime if $p \neq 0$ and $p \mid ab$ implies that $p \mid a$ or $p \mid b$.

The distinction between irreducible element and prime element is new. 
In general these notions do not coincide. As we have seen they do coincide 
in Z and k[x], and we shall prove shortly that they coincide in a PID.

Some of the notions we are discussing can be translated into the language
of ideals. Thus $a \mid b$ iff $(b) \subseteq (a)$. u is a unit iff (u) = R. a and b are
associate iff (a) = (b). p is prime iff $ab \in (p)$ implies that either $a \in (p)$ or
$b \in (p)$. All these assertions are easy exercises. The notion of irreducible
element can be formulated in terms of ideals, but we will not need it.

Definition. $d \in R$ is said to be a greatest common divisor (gcd) of two elements
$a, b \in R$ if

(a) $d \mid a$ and $d \mid b$.

(b) $d' \mid a$ and $d' \mid b$ implies that $d' \mid d$.

It is easy to see that if both $d$ and $d'$ are gcd’s of $a$ and $b$, then $d$ is associate
to $d'$.

The gcd of two elements need not exist in a general ring. However, 

Proposition 1.3.2. Let $R$ be a PID and $a, b \in R$. Then $a$ and $b$ have a greatest
common divisor $d$ and $(a, b) = (d)$.

Proof. Form the ideal $(a, b)$. Since $R$ is a PID there is an element $d$ such that
$(a, b) = (d)$. Since $(a) \subseteq (d)$ and $(b) \subseteq (d)$ we have $d \mid a$ and $d \mid b$. If $d' \mid a$
and $d' \mid b$, then $(a) \subseteq (d')$ and $(b) \subseteq (d')$. Thus $(d) = (a, b) \subseteq (d')$ and $d' \mid d$.
We have proved that $d$ is a gcd of $a$ and $b$ and that $(a, b) = (d)$. □
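
In the special case $R = \mathbb{Z}$ the generator $d$ of the ideal $(a, b)$ can be computed explicitly, together with integers $m, n$ such that $am + bn = d$. The following is only a minimal sketch of the extended Euclidean algorithm; the function name `extended_gcd` is an illustrative choice and does not come from the text.

```python
def extended_gcd(a, b):
    """Return (d, m, n) with d = gcd(a, b) >= 0 and a*m + b*n == d.

    The equality a*m + b*n == d exhibits d as an element of the ideal (a, b);
    since d divides both a and b, it follows that (a, b) = (d) in Z.
    """
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        # Invariant: old_r == a*old_m + b*old_n and r == a*m + b*n.
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    if old_r < 0:  # normalize the generator to be nonnegative
        old_r, old_m, old_n = -old_r, -old_m, -old_n
    return old_r, old_m, old_n


d, m, n = extended_gcd(12, 18)
print(d, 12 * m + 18 * n)  # both values equal the gcd of 12 and 18
```

The pair $(m, n)$ is not unique; any pair returned by such a routine only witnesses that $d$ lies in $(a, b)$.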

Two elements a and b are said to be relatively prime if the only common 
divisors are units. 

Corollary 1. If $R$ is a PID and $a, b \in R$ are relatively prime, then $(a, b) = R$.

Corollary 2. If $R$ is a PID and $p \in R$ is irreducible, then $p$ is prime.

Proof. Suppose that $p \mid ab$ and that $p \nmid a$. Since $p \nmid a$ it follows that the only
common divisors are units. By Corollary 1, $(a, p) = R$. Thus $(ab, pb) = (b)$.
Since $ab \in (p)$ and $pb \in (p)$ we have $(b) \subseteq (p)$. Thus $p \mid b$.

It is easy to see that a prime is irreducible. □

From now on R will be a PID and we shall use the words prime and 
irreducible interchangeably. 

We want to show that every nonzero element of $R$ is a product of irreducible
elements. The proof is in two steps. First one shows that if $a \in R$,
$a \neq 0$, there is an irreducible dividing $a$. Then we show that $a$ is a product of
irreducibles.

Lemma 1. Let $(a_1) \subseteq (a_2) \subseteq (a_3) \subseteq \cdots$ be an ascending chain of ideals. Then
there is an integer $k$ such that $(a_k) = (a_{k+l})$ for $l = 0, 1, 2, \ldots$. In other words,
the chain breaks off in finitely many steps.

Proof. Let $I = \bigcup_{i=1}^{\infty} (a_i)$. It is easy to see that $I$ is an ideal. Thus $I = (a)$ for
some $a \in R$. But $a \in \bigcup_{i=1}^{\infty} (a_i)$ implies that $a \in (a_k)$ for some $k$, which shows
that $I = (a) \subseteq (a_k)$. It follows that $I = (a_k) = (a_{k+1}) = \cdots$. □

Proposition 1.3.3. Every nonzero nonunit of $R$ is a product of irreducibles.

Proof. Let $a \in R$, $a \neq 0$, $a$ not a unit. We wish to show, to begin with, that $a$
is divisible by an irreducible element. If $a$ is irreducible, we are done. Otherwise
$a = a_1 b_1$, where $a_1$ and $b_1$ are nonunits. If $a_1$ is irreducible, we are done.
Otherwise $a_1 = a_2 b_2$, where $a_2$ and $b_2$ are nonunits. If $a_2$ is irreducible, we
are done. Otherwise continue as before. Notice that $(a) \subset (a_1) \subset (a_2) \subset \cdots$.
By Lemma 1 this chain cannot go on indefinitely. Thus for some $k$, $a_k$ is
irreducible.

We now show that $a$ is a product of irreducibles. If $a$ is irreducible, we are
done. Otherwise let $p_1$ be an irreducible such that $p_1 \mid a$. Then $a = p_1 c_1$. If
$c_1$ is a unit, we are done. Otherwise let $p_2$ be an irreducible such that $p_2 \mid c_1$.
Then $a = p_1 p_2 c_2$. If $c_2$ is a unit, we are done. Otherwise continue as before.
Notice that $(a) \subset (c_1) \subset (c_2) \subset \cdots$. This chain cannot go on indefinitely
by Lemma 1. Thus for some $k$, $a = p_1 p_2 \cdots p_k c_k$, where $c_k$ is a unit. Since
$p_k c_k$ is irreducible, we are done. □

We now want to define an ord function as we have done in Sections 1 
and 2. 

Lemma 2. Let $p$ be a prime and $a \neq 0$. Then there is an integer $n$ such that $p^n \mid a$
but $p^{n+1} \nmid a$.

Proof. If the lemma were false, then for each integer $m > 0$ there would be
an element $b_m$ such that $a = p^m b_m$. Then $p b_{m+1} = b_m$ so that
$(b_1) \subset (b_2) \subset (b_3) \subset \cdots$ would be an infinite ascending chain of ideals that does not
break off. This contradicts Lemma 1. □

The integer $n$, which is defined in Lemma 2, is uniquely determined by
$p$ and $a$. We set $n = \operatorname{ord}_p a$.

Lemma 3. If $a, b \in R$ with $a, b \neq 0$, then $\operatorname{ord}_p ab = \operatorname{ord}_p a + \operatorname{ord}_p b$.

Proof. Let $\alpha = \operatorname{ord}_p a$ and $\beta = \operatorname{ord}_p b$. Then $a = p^{\alpha} c$ and $b = p^{\beta} d$ with
$p \nmid c$ and $p \nmid d$. Thus $ab = p^{\alpha+\beta} cd$. Since $p$ is prime $p \nmid cd$. Consequently,
$\operatorname{ord}_p ab = \alpha + \beta = \operatorname{ord}_p a + \operatorname{ord}_p b$. □

We are now in a position to formulate and prove the main theorem of this 
section. 

Let $S$ be a set of primes in $R$ with the following two properties:

(a) Every prime in R is associate to a prime in S. 

(b) No two primes in S are associate. 

To obtain such a set choose one prime out of each class of associate
primes. There is clearly a great deal of arbitrariness in this choice. In $\mathbb{Z}$
and $k[x]$ there were natural ways to make the choice. In $\mathbb{Z}$ we chose $S$ to be
the set of positive primes. In $k[x]$ we chose $S$ to be the set of monic irreducible
polynomials. In general there is no neat way to make the choice and this
occasionally leads to complications (see Chapter 9).

Theorem 3. Let $R$ be a PID and $S$ a set of primes with the properties given above.
Then if $a \in R$, $a \neq 0$, we can write

$$a = u \prod_{p} p^{e(p)}, \qquad (1)$$

where $u$ is a unit and the product is over all $p \in S$. The unit $u$ and the exponents
$e(p)$ are uniquely determined by $a$. In fact, $e(p) = \operatorname{ord}_p a$.

Proof. The existence of such a decomposition follows immediately from
Proposition 1.3.3.

To prove the uniqueness, let $q$ be a prime in $S$ and apply $\operatorname{ord}_q$ to both
sides of Equation (1). Using Lemma 3 we get

$$\operatorname{ord}_q a = \operatorname{ord}_q u + \sum_{p} e(p) \operatorname{ord}_q p.$$

Now, from the definition of $\operatorname{ord}_q$ we see that $\operatorname{ord}_q u = 0$ and that $\operatorname{ord}_q p = 0$
if $q \neq p$ and $1$ if $q = p$. Thus $\operatorname{ord}_q a = e(q)$. Since the exponents $e(q)$ are
uniquely determined so is the unit $u$. This completes the proof. □
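
For $R = \mathbb{Z}$ with $S$ the set of positive primes, the decomposition (1) and the function $\operatorname{ord}_p$ can be computed by trial division. The sketch below is illustrative only; the names `ord_p` and `factor` are assumptions of this example, and trial division is used for simplicity rather than efficiency.

```python
def ord_p(p, a):
    """Largest n with p**n dividing a, for a prime p and a nonzero integer a."""
    if a == 0:
        raise ValueError("ord_p is only defined for nonzero a")
    n = 0
    while a % p == 0:
        a //= p
        n += 1
    return n


def factor(a):
    """Return (u, e) with a == u * product(p**e[p]), u = +1 or -1, p positive primes."""
    if a == 0:
        raise ValueError("0 has no factorization of the form (1)")
    unit = 1 if a > 0 else -1
    a = abs(a)
    exponents = {}
    p = 2
    while p * p <= a:
        while a % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            a //= p
        p += 1
    if a > 1:  # whatever is left is a prime larger than sqrt of the cofactor
        exponents[a] = exponents.get(a, 0) + 1
    return unit, exponents


print(factor(-360))                                  # unit -1, exponents of 2, 3, 5
print(ord_p(2, 12 * 40) == ord_p(2, 12) + ord_p(2, 40))  # Lemma 3 for p = 2
```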


§4 The Rings $\mathbb{Z}[i]$ and $\mathbb{Z}[\omega]$

As an application of the results in Section 3 we shall consider two examples 
that will be useful to us in later chapters. 

Let $i = \sqrt{-1}$ and consider the set of complex numbers $\mathbb{Z}[i]$ defined
by $\{a + bi \mid a, b \in \mathbb{Z}\}$. This set is clearly closed under addition and subtraction.
Moreover, if $a + bi, c + di \in \mathbb{Z}[i]$, then $(a + bi)(c + di) = ac +
adi + bci + bdi^2 = (ac - bd) + (ad + bc)i \in \mathbb{Z}[i]$. Thus $\mathbb{Z}[i]$ is closed
under multiplication and is a ring. Since $\mathbb{Z}[i]$ is contained in the complex
numbers it is an integral domain.

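Proposition 1.4.1. $\mathbb{Z}[i]$ is a Euclidean domain.

Proof. For $a + bi \in \mathbb{Q}[i]$ define $\lambda(a + bi) = a^2 + b^2$.

Let $\alpha = a + bi$ and $\gamma = c + di$ and suppose that $\gamma \neq 0$. $\alpha/\gamma = r + si$,
where $r$ and $s$ are real numbers (they are, in fact, rational). Choose integers
$m, n \in \mathbb{Z}$ such that $|r - m| \le \tfrac{1}{2}$ and $|s - n| \le \tfrac{1}{2}$. Set $\delta = m + ni$. Then
$\delta \in \mathbb{Z}[i]$ and $\lambda((\alpha/\gamma) - \delta) = (r - m)^2 + (s - n)^2 \le \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}$. Set
$\rho = \alpha - \gamma\delta$. Then $\rho \in \mathbb{Z}[i]$ and either $\rho = 0$ or
$\lambda(\rho) = \lambda(\gamma((\alpha/\gamma) - \delta)) = \lambda(\gamma)\lambda((\alpha/\gamma) - \delta) \le \tfrac{1}{2}\lambda(\gamma) < \lambda(\gamma)$.

It follows that $\lambda$ makes $\mathbb{Z}[i]$ into a Euclidean domain. □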
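
The proof is constructive: rounding the real and imaginary parts of $\alpha/\gamma$ to nearest integers yields a quotient $\delta$ and a remainder $\rho$ with $\lambda(\rho) < \lambda(\gamma)$. The following sketch carries this out in exact integer arithmetic, representing $a + bi$ as the pair `(a, b)`; the function names are illustrative assumptions, not notation from the text.

```python
def norm(z):
    """lambda(a + bi) = a^2 + b^2 for z = (a, b)."""
    a, b = z
    return a * a + b * b


def nearest_int(num, den):
    """An integer m with |num/den - m| <= 1/2, for den > 0, using integers only."""
    return (2 * num + den) // (2 * den)


def gauss_divmod(alpha, gamma):
    """Return (delta, rho) with alpha = gamma*delta + rho and norm(rho) < norm(gamma)."""
    a, b = alpha
    c, d = gamma
    n = norm(gamma)
    if n == 0:
        raise ZeroDivisionError("gamma must be nonzero")
    # alpha/gamma = alpha * conj(gamma) / lambda(gamma) = r + s i with
    # r = (ac + bd)/n and s = (bc - ad)/n.
    m = nearest_int(a * c + b * d, n)
    k = nearest_int(b * c - a * d, n)
    delta = (m, k)
    # gamma * delta = (cm - dk) + (ck + dm) i
    rho = (a - (c * m - d * k), b - (c * k + d * m))
    return delta, rho


delta, rho = gauss_divmod((27, 23), (8, 1))
print(delta, rho, norm(rho) < norm((8, 1)))
```

Iterating `gauss_divmod` as in the ordinary Euclidean algorithm would compute a gcd in $\mathbb{Z}[i]$, determined up to a unit.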
The ring $\mathbb{Z}[i]$ is called the ring of Gaussian integers after C. F. Gauss,
who first studied its arithmetic properties in detail.

The numbers $\pm 1, \pm i$ are the roots of $x^4 = 1$ over the complex numbers.
Consider the equation $x^3 = 1$. Since $x^3 - 1 = (x - 1)(x^2 + x + 1)$
the roots of this equation are $1, (-1 \pm \sqrt{-3})/2$. Let $\omega = (-1 + \sqrt{-3})/2$.
Then it is easy to check that $\omega^2 = (-1 - \sqrt{-3})/2$ and that $1 + \omega + \omega^2 = 0$.

Consider the set $\mathbb{Z}[\omega] = \{a + b\omega \mid a, b \in \mathbb{Z}\}$. $\mathbb{Z}[\omega]$ is closed under
addition and subtraction. Moreover, $(a + b\omega)(c + d\omega) = ac + (ad + bc)\omega
+ bd\omega^2 = (ac - bd) + (ad + bc - bd)\omega$. Thus $\mathbb{Z}[\omega]$ is a ring. Again,
since $\mathbb{Z}[\omega]$ is a subset of the complex numbers it is an integral domain.

We remark that $\mathbb{Z}[\omega]$ is closed under complex conjugation. In fact, since
$\overline{\sqrt{-3}} = \overline{\sqrt{3}\,i} = -\sqrt{3}\,i = -\sqrt{-3}$ we see that $\bar{\omega} = \omega^2$. Thus if
$\alpha = a + b\omega \in \mathbb{Z}[\omega]$, then $\bar{\alpha} = a + b\bar{\omega} = a + b\omega^2 = (a - b) - b\omega \in \mathbb{Z}[\omega]$.


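Proposition 1.4.2. $\mathbb{Z}[\omega]$ is a Euclidean domain.

Proof. For $\alpha = a + b\omega \in \mathbb{Z}[\omega]$ define $\lambda(\alpha) = a^2 - ab + b^2$. A simple
calculation shows that $\lambda(\alpha) = \alpha\bar{\alpha}$.

Now, let $\alpha, \beta \in \mathbb{Z}[\omega]$ and suppose that $\beta \neq 0$. Then
$\alpha/\beta = \alpha\bar{\beta}/\beta\bar{\beta} = r + s\omega$, where $r$ and $s$ are rational numbers. We have used the fact that
$\beta\bar{\beta} = \lambda(\beta)$ is a positive integer and that $\alpha\bar{\beta} \in \mathbb{Z}[\omega]$ since $\alpha$ and $\bar{\beta} \in \mathbb{Z}[\omega]$.
Find integers $m$ and $n$ such that $|r - m| \le \tfrac{1}{2}$ and $|s - n| \le \tfrac{1}{2}$. Then
put $\gamma = m + n\omega$. $\lambda((\alpha/\beta) - \gamma) = (r - m)^2 - (r - m)(s - n) + (s - n)^2
\le \tfrac{1}{4} + \tfrac{1}{4} + \tfrac{1}{4} < 1$.

Let $\rho = \alpha - \gamma\beta$. Then either $\rho = 0$ or $\lambda(\rho) = \lambda(\beta((\alpha/\beta) - \gamma)) =
\lambda(\beta)\lambda((\alpha/\beta) - \gamma) < \lambda(\beta)$. □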
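
The same rounding argument can be carried out mechanically in $\mathbb{Z}[\omega]$. In the sketch below $a + b\omega$ is stored as the pair `(a, b)`, multiplication uses $\omega^2 = -1 - \omega$, and conjugation uses $\bar{\omega} = \omega^2$; all function names are hypothetical and chosen for this illustration only.

```python
def eis_norm(z):
    """lambda(a + b*omega) = a^2 - a*b + b^2."""
    a, b = z
    return a * a - a * b + b * b


def eis_mul(x, y):
    """(a + b w)(c + d w) = (ac - bd) + (ad + bc - bd) w, using w^2 = -1 - w."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)


def eis_conj(z):
    """Complex conjugate: a + b*conj(w) = (a - b) - b w."""
    a, b = z
    return (a - b, -b)


def eis_divmod(alpha, beta):
    """Return (gamma, rho) with alpha = gamma*beta + rho and eis_norm(rho) < eis_norm(beta)."""
    n = eis_norm(beta)
    if n == 0:
        raise ZeroDivisionError("beta must be nonzero")
    # alpha/beta = alpha * conj(beta) / lambda(beta) = (x + y w) / n
    x, y = eis_mul(alpha, eis_conj(beta))
    # Round each coordinate to a nearest integer (error at most 1/2).
    gamma = ((2 * x + n) // (2 * n), (2 * y + n) // (2 * n))
    prod = eis_mul(gamma, beta)
    rho = (alpha[0] - prod[0], alpha[1] - prod[1])
    return gamma, rho


gamma, rho = eis_divmod((7, 5), (2, -3))
print(gamma, rho, eis_norm(rho) < eis_norm((2, -3)))
```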

From the analysis of Section 3 we know that the theorem of unique
factorization is true in both $\mathbb{Z}[i]$ and $\mathbb{Z}[\omega]$. To go further with the analysis
of these rings we would have to investigate the units and the prime elements.
There are some results of this nature in the exercises.


Notes 

Rings for which the theorem of unique factorization into irreducibles holds 
are called unique factorization domains (UFD). The fact that Z is a UFD 
is already implicit in Euclid, but the first explicit and clear statement of the 
result seems to be in C. F. Gauss’ masterpiece Disquisitiones Arithmeticae 
(available in English translation by A. A. Clark, Yale University Press, 
New Haven, Conn., 1966). Zermelo gave a clever proof by contradiction, 
which is reproduced in the excellent book of G. H. Hardy and Wright 
[40]. See also Davis and Shisha [120]. 

We have shown that every PID is a UFD. The converse is not true. For 
example, the ring of polynomials over a field in more than one variable is a 
UFD but not a PID. P. Samuel has an excellent expository article on UFD’s
in [67]. A more elementary introduction may be found in the book of H. 
Rademacher and O. Toeplitz [65]. 

The reader may find it profitable to read the introductory material in 
several books on number theory. Chapter 3 of A. Frankel [32] and the 
introduction to H. Stark [73] are particularly good. There is also an early 
lecture by Hardy [39] that is highly recommended. 

The ring $\mathbb{Z}[i]$ was introduced by Gauss in his second memoir on biquadratic
reciprocity [34]. G. Eisenstein considered the ring $\mathbb{Z}[\omega]$ in connection
with his work on cubic reciprocity. He mentions that to investigate the
properties of this ring one need only consult Gauss’ work on $\mathbb{Z}[i]$ and
modify the proofs [28]. A thorough treatment of these two rings is given in
Chapter 12 of Hardy and Wright [40]. In Chapter 14 they treat a generalization,
namely, rings of integers in quadratic number fields. Stark’s Chapter 8
deals with the same subject [73]. In 1966 Stark resolved a long-outstanding
problem in the theory of numbers by showing that the ring of integers (see
Chapter 6 of this book) in the field $\mathbb{Q}(\sqrt{d})$, with $d$ negative, is a UFD when
$d = -1, -2, -3, -7, -11, -19, -43, -67$, and $-163$ and for no other
values of $d$.

The student who is familiar with a little algebra will notice that a “generic”
non-UFD is given by the ring $k[x, y, z, w]$, with $xy = zw$, where $k$ is a
field. Another example of a non-UFD is $\mathbb{C}[x, y, z]$, with $x^2 + y^2 + z^2 = 1$,
where $\mathbb{C}$ is the field of complex numbers. To see this notice that
$(x + iy)(x - iy) = (1 - z)(1 + z)$.


Exercises 


1. Let $a$ and $b$ be nonzero integers. We can find nonzero integers $q$ and $r$ such that
$a = qb + r$, where $0 \le r < b$. Prove that $(a, b) = (b, r)$.

2. (continuation) If $r \neq 0$, we can find $q_1$ and $r_1$ such that $b = q_1 r + r_1$ with
$0 \le r_1 < r$. Show that $(a, b) = (r, r_1)$. This process can be repeated. Show that it must end
in finitely many steps. Show that the last nonzero remainder must equal $(a, b)$. The
process looks like

$$\begin{aligned}
a &= qb + r, & 0 < r < b,\\
b &= q_1 r + r_1, & 0 < r_1 < r,\\
r &= q_2 r_1 + r_2, & 0 < r_2 < r_1,\\
&\;\;\vdots \\
r_{k-1} &= q_{k+1} r_k + r_{k+1}, & 0 < r_{k+1} < r_k,\\
r_k &= q_{k+2} r_{k+1}.
\end{aligned}$$

Then $r_{k+1} = (a, b)$. This process of finding $(a, b)$ is known as the Euclidean algorithm.

3. Calculate (187, 221), (6188, 4709), and (314, 159).

4. Let $d = (a, b)$. Show how one can use the Euclidean algorithm to find numbers $m$
and $n$ such that $am + bn = d$. (Hint: In Exercise 2 we have that $d = r_{k+1}$. Express
$r_{k+1}$ in terms of $r_k$ and $r_{k-1}$, then in terms of $r_{k-1}$ and $r_{k-2}$, etc.)

5. Find m and n for the pairs a and b given in Exercise 3. 

6. Let $a, b, c \in \mathbb{Z}$. Show that the equation $ax + by = c$ has solutions in integers iff
$(a, b) \mid c$.

7. Let $d = (a, b)$ and $a = da'$ and $b = db'$. Show that $(a', b') = 1$.

8. Let $x_0$ and $y_0$ be a solution to $ax + by = c$. Show that all solutions have the form
$x = x_0 + t(b/d)$, $y = y_0 - t(a/d)$, where $d = (a, b)$ and $t \in \mathbb{Z}$.

9. Suppose that $u, v \in \mathbb{Z}$ and that $(u, v) = 1$. If $u \mid n$ and $v \mid n$, show that $uv \mid n$. Show that this
is false if $(u, v) \neq 1$.

10. Suppose that $(u, v) = 1$. Show that $(u + v, u - v)$ is either 1 or 2.

11. Show that $(a, a + k) \mid k$.


12. Suppose that we take several copies of a regular polygon and try to fit them evenly 
about a common vertex. Prove that the only possibilities are six equilateral triangles, 
four squares, and three hexagons. 

13. Let $n_1, n_2, \ldots, n_s \in \mathbb{Z}$. Define the greatest common divisor $d$ of $n_1, n_2, \ldots, n_s$ and
prove that there exist integers $m_1, m_2, \ldots, m_s$ such that $n_1 m_1 + n_2 m_2 + \cdots +
n_s m_s = d$.

14. Discuss the solvability of $a_1 x_1 + a_2 x_2 + \cdots + a_r x_r = c$ in integers. (Hint: Use
Exercise 13 to extend the reasoning behind Exercise 6.)

15. Prove that $a \in \mathbb{Z}$ is the square of another integer iff $\operatorname{ord}_p a$ is even for all primes $p$.
Give a generalization.

16. If $(u, v) = 1$ and $uv = a^2$, show that both $u$ and $v$ are squares.

17. Prove that the square root of 2 is irrational, i.e., that there is no rational number
$r = a/b$ such that $r^2 = 2$.

18. Prove that $\sqrt[n]{m}$ is irrational if $m$ is not the $n$th power of an integer.

19. Define the least common multiple of two integers $a$ and $b$ to be an integer $m$ such that
$a \mid m$, $b \mid m$, and $m$ divides every common multiple of $a$ and $b$. Show that such an $m$
exists. It is determined up to sign. We shall denote it by $[a, b]$.

20. Prove the following:

(a) $\operatorname{ord}_p [a, b] = \max(\operatorname{ord}_p a, \operatorname{ord}_p b)$.

(b) $(a, b)[a, b] = ab$.

(c) $(a + b, [a, b]) = (a, b)$.

21. Prove that $\operatorname{ord}_p (a + b) \ge \min(\operatorname{ord}_p a, \operatorname{ord}_p b)$ with equality holding if
$\operatorname{ord}_p a \neq \operatorname{ord}_p b$.

22. Almost all the previous exercises remain valid if instead of the ring Z we consider 
the ring k[x]. Indeed, in most we can consider any Euclidean domain. Convince 
yourself of this fact. For simplicity we shall continue to work in Z. 
23. Suppose that $a^2 + b^2 = c^2$ with $a, b, c \in \mathbb{Z}$. For example, $3^2 + 4^2 = 5^2$ and
$5^2 + 12^2 = 13^2$. Assume that $(a, b) = (b, c) = (c, a) = 1$. Prove that there exist integers $u$
and $v$ such that $c - b = 2u^2$ and $c + b = 2v^2$ and $(u, v) = 1$ (there is no loss in
generality in assuming that $b$ and $c$ are odd and that $a$ is even). Consequently $a = 2uv$,
$b = v^2 - u^2$, and $c = v^2 + u^2$. Conversely show that if $u$ and $v$ are given, then the
three numbers $a$, $b$, and $c$ given by these formulas satisfy $a^2 + b^2 = c^2$.

24. Prove the identities

(a) $x^n - y^n = (x - y)(x^{n-1} + x^{n-2}y + \cdots + y^{n-1})$.

(b) For $n$ odd, $x^n + y^n = (x + y)(x^{n-1} - x^{n-2}y + x^{n-3}y^2 - \cdots + y^{n-1})$.

25. If $a^n - 1$ is a prime, show that $a = 2$ and that $n$ is a prime. Primes of the form $2^p - 1$
are called Mersenne primes. For example, $2^3 - 1 = 7$ and $2^5 - 1 = 31$. It is not
known if there are infinitely many Mersenne primes.

26. If $a^n + 1$ is a prime, show that $a$ is even and that $n$ is a power of 2. Primes of the
form $2^{2^t} + 1$ are called Fermat primes. For example, $2^{2^1} + 1 = 5$ and $2^{2^2} + 1 = 17$.
It is not known if there are infinitely many Fermat primes.

27. For all odd $n$ show that $8 \mid n^2 - 1$. If $3 \nmid n$, show that $6 \mid n^2 - 1$.

28. For all $n$ show that $30 \mid n^5 - n$ and that $42 \mid n^7 - n$.

29. Suppose that $a, b, c, d \in \mathbb{Z}$ and that $(a, b) = (c, d) = 1$. If $(a/b) + (c/d)$ = an integer,
show that $b = \pm d$.

30. Prove that $\tfrac{1}{2} + \tfrac{1}{3} + \cdots + \tfrac{1}{n}$ is not an integer.

31. Show that 2 is divisible by $(1 + i)^2$ in $\mathbb{Z}[i]$.

32. For $\alpha = a + bi \in \mathbb{Z}[i]$ we defined $\lambda(\alpha) = a^2 + b^2$. From the properties of $\lambda$ deduce the
identity $(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2$.

33. Show that $\alpha \in \mathbb{Z}[i]$ is a unit iff $\lambda(\alpha) = 1$. Deduce that $1, -1, i$, and $-i$ are the only
units in $\mathbb{Z}[i]$.

34. Show that 3 is divisible by $(1 - \omega)^2$ in $\mathbb{Z}[\omega]$.

35. For $\alpha = a + b\omega \in \mathbb{Z}[\omega]$ we defined $\lambda(\alpha) = a^2 - ab + b^2$. Show that $\alpha$ is a unit iff
$\lambda(\alpha) = 1$. Deduce that $1, -1, \omega, -\omega, \omega^2$, and $-\omega^2$ are the only units in $\mathbb{Z}[\omega]$.

36. Define $\mathbb{Z}[\sqrt{-2}]$ as the set of all complex numbers of the form $a + b\sqrt{-2}$, where
$a, b \in \mathbb{Z}$. Show that $\mathbb{Z}[\sqrt{-2}]$ is a ring. Define $\lambda(\alpha) = a^2 + 2b^2$ for $\alpha = a + b\sqrt{-2}$.
Use $\lambda$ to show that $\mathbb{Z}[\sqrt{-2}]$ is a Euclidean domain.

37. Show that the only units in $\mathbb{Z}[\sqrt{-2}]$ are 1 and $-1$.

38. Suppose that $\pi \in \mathbb{Z}[i]$ and that $\lambda(\pi) = p$ is a prime in $\mathbb{Z}$. Show that $\pi$ is a prime in
$\mathbb{Z}[i]$. Show that the corresponding result holds in $\mathbb{Z}[\omega]$ and $\mathbb{Z}[\sqrt{-2}]$.

39. Show that in any integral domain a prime element is irreducible. 



Chapter 2 


Applications of Unique 
Factorization 


The importance of the notion of prime number should be 
evident from the results of Chapter 1. 

In this chapter we shall give several proofs of the fact
that there are infinitely many primes in $\mathbb{Z}$. We shall also
consider the analogous question for the ring $k[x]$.

The theorem of unique prime decomposition is sometimes
referred to as the fundamental theorem of arithmetic.
We shall begin to demonstrate its usefulness by
using it to investigate the properties of some natural
number-theoretic functions.


§1 Infinitely Many Primes in Z 


Theorem 1 (Euclid). In the ring Z there are infinitely many prime numbers. 

Proof. Let us consider positive primes. Label them in increasing order
$p_1, p_2, p_3, \ldots$. Thus $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, etc. Let $N = (p_1 p_2 \cdots p_n) + 1$.
$N$ is greater than 1 and not divisible by any $p_i$, $i = 1, 2, \ldots, n$. On the other
hand, $N$ is divisible by some prime, $p$, and $p$ must be greater than $p_n$.

We have shown that given any positive prime there is another prime that 
is greater. It follows that the set of primes is infinite. □ 
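
Euclid's argument can be traced numerically. The sketch below forms $N = p_1 p_2 \cdots p_n + 1$ for a given list of primes and finds its smallest prime divisor by trial division; note that $N$ itself need not be prime, only its prime divisors are guaranteed to be new. The function names are illustrative assumptions.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest prime dividing n, for an integer n >= 2 (trial division)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def euclid_new_prime(primes):
    """Return (N, p) where N = product(primes) + 1 and p is the smallest prime dividing N."""
    N = prod(primes) + 1
    return N, smallest_prime_factor(N)


print(euclid_new_prime([2, 3, 5]))              # here N happens to be prime
print(euclid_new_prime([2, 3, 5, 7, 11, 13]))   # here N is composite
```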

The analogous theorem for $k[x]$ is that there are infinitely many monic,
irreducible polynomials. If $k$ is infinite, this is trivial since $x - a$ is monic and
irreducible for all $a \in k$. This proof does not work if $k$ is finite, but Euclid’s
proof may easily be adapted to this case. We leave this as an exercise.

Recall that in an integral domain two elements are called associate if they
differ only by multiplication by a unit. We now know that in $\mathbb{Z}$ and $k[x]$ there
are infinitely many nonassociate primes. It is instructive to consider a ring
where all primes are associate, so that in essence there is only one prime.

Let $p \in \mathbb{Z}$ be a prime number and let $\mathbb{Z}_p$ be the set of all rational numbers
$a/b$, where $p \nmid b$. One easily checks using the remark following Corollary 1 to
Proposition 1.1.1 that $\mathbb{Z}_p$ is a ring. $a/b \in \mathbb{Z}_p$ is a unit if there is a $c/d \in \mathbb{Z}_p$
such that $a/b \cdot c/d = 1$. Then $ac = bd$, which implies $p \nmid a$ since $p \nmid b$ and
$p \nmid d$. Conversely, any rational number $a/b$ is a unit in $\mathbb{Z}_p$ if $p \nmid a$ and $p \nmid b$.
If $a/b \in \mathbb{Z}_p$, write $a = p^l a'$, where $p \nmid a'$. Then $a/b = p^l a'/b$. Thus every element
of $\mathbb{Z}_p$ is a power of $p$ times a unit. From this it is easy to see that the only
primes in $\mathbb{Z}_p$ have the form $pc/d$, where $c/d$ is a unit. Thus all the primes of
$\mathbb{Z}_p$ are associate.


Exercise 

If $a/b \in \mathbb{Z}_p$ is not a unit, prove that $a/b + 1$ is a unit. This phenomenon shows why Euclid’s
proof breaks down in general for integral domains. 


§2 Some Arithmetic Functions 

In the remainder of this chapter we shall give some applications of the unique 
factorization theorem. 

An integer $a \in \mathbb{Z}$ is said to be square-free if it is not divisible by the square
of any other integer greater than 1.

Proposition 2.2.1. If $n \in \mathbb{Z}$, $n$ can be written in the form $n = ab^2$, where $a, b \in \mathbb{Z}$
and $a$ is square-free.

Proof. Let $n = p_1^{a_1} p_2^{a_2} \cdots p_l^{a_l}$. One can write $a_i = 2b_i + r_i$, where $r_i = 0$ or 1
depending on whether $a_i$ is even or odd. Set $a = p_1^{r_1} p_2^{r_2} \cdots p_l^{r_l}$ and
$b = p_1^{b_1} p_2^{b_2} \cdots p_l^{b_l}$. Then $n = ab^2$ and $a$ is clearly square-free. □
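
The proof translates directly into a procedure: split each exponent $a_i$ into $2b_i + r_i$. The following sketch does this for a positive integer by trial division; the name `squarefree_decomposition` is an assumption of this example.

```python
def squarefree_decomposition(n):
    """Return (a, b) with n == a * b**2 and a square-free, for an integer n >= 1."""
    if n <= 0:
        raise ValueError("this sketch handles positive integers only")
    a, b = 1, 1
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        # e = 2*(e // 2) + (e % 2): the even part goes into b, the parity into a.
        b *= p ** (e // 2)
        a *= p ** (e % 2)
        p += 1
    a *= n  # the remaining cofactor is 1 or a prime occurring to the first power
    return a, b


print(squarefree_decomposition(360))  # 360 = 2^3 * 3^2 * 5
```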

This lemma can be used to give another proof that there are infinitely
many primes in $\mathbb{Z}$. Assume that there are not, and let $p_1, p_2, \ldots, p_l$ be a complete
list of positive primes. Consider the set of positive integers less than or
equal to $N$. If $n \le N$, then $n = ab^2$, where $a$ is square-free and thus equal to
one of the $2^l$ numbers $p_1^{\varepsilon_1} p_2^{\varepsilon_2} \cdots p_l^{\varepsilon_l}$, where $\varepsilon_i = 0$ or 1, $i = 1, \ldots, l$. Notice
that $b \le \sqrt{N}$. There are at most $2^l \sqrt{N}$ numbers satisfying these conditions
and so $N \le 2^l \sqrt{N}$, or $\sqrt{N} \le 2^l$, which is clearly false for $N$ large enough.
This contradiction proves the result.

It is possible to give a similar proof that there are infinitely many monic
irreducibles in $k[x]$, where $k$ is a finite field.

There are a number of naturally defined functions on the integers. For
example, given a positive integer $n$ let $\nu(n)$ be the number of positive divisors
of $n$ and $\sigma(n)$ the sum of the positive divisors of $n$. For example, $\nu(3) = 2$,
$\nu(6) = 4$, and $\nu(12) = 6$ and $\sigma(3) = 4$, $\sigma(6) = 12$, and $\sigma(12) = 28$. Using
unique factorization it is possible to obtain rather simple formulas for these
functions.





Proposition 2.2.2. If $n$ is a positive integer, let $n = p_1^{a_1} p_2^{a_2} \cdots p_l^{a_l}$ be its prime
decomposition. Then

(a) $\nu(n) = (a_1 + 1)(a_2 + 1) \cdots (a_l + 1)$.

(b) $\sigma(n) = ((p_1^{a_1+1} - 1)/(p_1 - 1))((p_2^{a_2+1} - 1)/(p_2 - 1)) \cdots ((p_l^{a_l+1} - 1)/(p_l - 1))$.

Proof. To prove part (a) notice that $m \mid n$ iff $m = p_1^{b_1} p_2^{b_2} \cdots p_l^{b_l}$ and $0 \le b_i \le a_i$
for $i = 1, 2, \ldots, l$. Thus the positive divisors of $n$ are in one-to-one correspondence
with the $l$-tuples $(b_1, b_2, \ldots, b_l)$ with $0 \le b_i \le a_i$ for $i = 1, \ldots, l$, and
there are exactly $(a_1 + 1)(a_2 + 1) \cdots (a_l + 1)$ such $l$-tuples.

To prove part (b) notice that $\sigma(n) = \sum p_1^{b_1} p_2^{b_2} \cdots p_l^{b_l}$, where the sum is over
the above set of $l$-tuples. Thus,
$\sigma(n) = (\sum_{b_1=0}^{a_1} p_1^{b_1})(\sum_{b_2=0}^{a_2} p_2^{b_2}) \cdots (\sum_{b_l=0}^{a_l} p_l^{b_l})$,
from which the result follows by use of the summation formula for the geometric
series. □
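
The two formulas are easy to check against the definitions. The sketch below computes $\nu(n)$ and $\sigma(n)$ from the prime decomposition; the helper names `prime_factorization`, `nu`, and `sigma` are illustrative and not taken from the text.

```python
def prime_factorization(n):
    """Return {p: a} with n == product(p**a), for an integer n >= 1."""
    exponents = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exponents[n] = exponents.get(n, 0) + 1
    return exponents


def nu(n):
    """Number of positive divisors: product of (a_i + 1)."""
    result = 1
    for a in prime_factorization(n).values():
        result *= a + 1
    return result


def sigma(n):
    """Sum of positive divisors: product of (p^(a+1) - 1) / (p - 1)."""
    result = 1
    for p, a in prime_factorization(n).items():
        result *= (p ** (a + 1) - 1) // (p - 1)
    return result


print(nu(12), sigma(12))   # the text gives nu(12) = 6 and sigma(12) = 28
print(sigma(28) == 2 * 28) # 28 is perfect
```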

There is an interesting and unsolved problem connected with the function
$\sigma(n)$. A number $n$ is said to be perfect if $\sigma(n) = 2n$. For example, 6 and 28 are
perfect. In general, if $2^{m+1} - 1$ is a prime, then $n = 2^m(2^{m+1} - 1)$ is perfect,
as can be seen by applying part (b) of Proposition 2.2.2. This fact is already in
Euclid. L. Euler showed that any even perfect number has this form. Thus
the problem of even perfect numbers is reduced to that of finding primes of
the form $2^{m+1} - 1$. Such primes are called Mersenne primes. The two outstanding
problems involving perfect numbers are the following: Are there
infinitely many perfect numbers? Are there any odd perfect numbers?

The multiplicative analog of this problem is trivial. An integer $n$ is called
multiplicatively perfect if the product of the positive divisors of $n$ is $n^2$. Such
a number cannot be a prime or a square of a prime. Thus there is a proper
divisor $d$ such that $d \neq n/d$. The product of the divisors $1, d, n/d$, and $n$ is
already $n^2$. Thus $n$ is multiplicatively perfect iff there are exactly two proper
divisors. The only such numbers are cubes of primes or products of two
distinct primes. For example, 27 and 10 are multiplicatively perfect.

We now introduce a very important arithmetic function, the Möbius $\mu$
function. For $n \in \mathbb{Z}^+$, $\mu(1) = 1$, $\mu(n) = 0$ if $n$ is not square-free, and
$\mu(p_1 p_2 \cdots p_l) = (-1)^l$, where the $p_i$ are distinct positive primes.

Proposition 2.2.3. If $n > 1$, $\sum_{d \mid n} \mu(d) = 0$.

Proof. If $n = p_1^{a_1} p_2^{a_2} \cdots p_l^{a_l}$, then
$\sum_{d \mid n} \mu(d) = \sum_{(\varepsilon_1, \ldots, \varepsilon_l)} \mu(p_1^{\varepsilon_1} \cdots p_l^{\varepsilon_l})$, where the
$\varepsilon_i$ are zero or 1. Thus

$$\sum_{d \mid n} \mu(d) = 1 - l + \binom{l}{2} - \binom{l}{3} + \cdots + (-1)^l = (1 - 1)^l = 0. \qquad □$$


The full significance of the Möbius $\mu$ function can be understood most
clearly when its connection with Dirichlet multiplication is brought to light.
Let $f$ and $g$ be complex valued functions on $\mathbb{Z}^+$. The Dirichlet product of $f$
and $g$ is defined by the formula $f \circ g(n) = \sum f(d_1) g(d_2)$, where the sum is over
all pairs $(d_1, d_2)$ of positive integers such that $d_1 d_2 = n$. This product is
associative, as one can see by checking that $f \circ (g \circ h)(n) = (f \circ g) \circ h(n) =
\sum f(d_1) g(d_2) h(d_3)$, where the sum is over all 3-tuples $(d_1, d_2, d_3)$ of positive
integers such that $d_1 d_2 d_3 = n$.

Define the function $\mathbb{I}$ by $\mathbb{I}(1) = 1$ and $\mathbb{I}(n) = 0$ for $n > 1$. Then
$f \circ \mathbb{I} = \mathbb{I} \circ f = f$. Define $I$ by $I(n) = 1$ for all $n \in \mathbb{Z}^+$. Then
$f \circ I(n) = I \circ f(n) = \sum_{d \mid n} f(d)$.

Lemma. $I \circ \mu = \mu \circ I = \mathbb{I}$.

Proof. $\mu \circ I(1) = \mu(1) I(1) = 1$. If $n > 1$, $\mu \circ I(n) = \sum_{d \mid n} \mu(d) = 0$. The same
proof works for $I \circ \mu$. □

Theorem 2 (Möbius Inversion Theorem). Let $F(n) = \sum_{d \mid n} f(d)$. Then
$f(n) = \sum_{d \mid n} \mu(d) F(n/d)$.

Proof. $F = f \circ I$. Thus $F \circ \mu = (f \circ I) \circ \mu = f \circ (I \circ \mu) = f \circ \mathbb{I} = f$. This shows
that $f(n) = F \circ \mu(n) = \sum_{d \mid n} \mu(d) F(n/d)$. □
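
The inversion formula can be verified numerically for any concrete function. In the sketch below, `mobius`, `dirichlet`, `summatory`, and `mobius_invert` are hypothetical helper names; the Dirichlet product is implemented naively by looping over all divisors, which is adequate only for small $n$.

```python
def mobius(n):
    """Moebius function: 0 if n is not square-free, else (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divided the original argument
                return 0
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def dirichlet(f, g):
    """Dirichlet product: (f o g)(n) = sum of f(d1) * g(d2) over d1 * d2 = n."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def summatory(f):
    """F = f o I, i.e. F(n) = sum of f(d) over d | n."""
    return dirichlet(f, lambda n: 1)


def mobius_invert(F):
    """Recover f from F via f(n) = sum of mu(d) * F(n/d) over d | n."""
    return dirichlet(mobius, F)


f = lambda n: n * n + 1          # an arbitrary test function
F = summatory(f)
print(all(mobius_invert(F)(n) == f(n) for n in range(1, 50)))
```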

Remark. We have considered complex-valued functions on the positive 
integers. It is useful to notice that Theorem 2 is valid whenever the functions 
take their value in an abelian group. The proof goes through word for word. 

If the group law in the abelian group is written multiplicatively, the
theorem takes the following form: If $F(n) = \prod_{d \mid n} f(d)$, then
$f(n) = \prod_{d \mid n} F(n/d)^{\mu(d)}$.

The Möbius inversion theorem has many applications. We shall use it to
obtain a formula for yet another arithmetic function, the Euler $\phi$ function.
For $n \in \mathbb{Z}^+$, $\phi(n)$ is defined to be the number of integers between 1 and $n$
relatively prime to $n$. For example, $\phi(1) = 1$, $\phi(5) = 4$, $\phi(6) = 2$, and
$\phi(9) = 6$. If $p$ is a prime, it is clear that $\phi(p) = p - 1$.

Proposition 2.2.4. $\sum_{d \mid n} \phi(d) = n$.

Proof. Consider the $n$ rational numbers $1/n, 2/n, 3/n, \ldots, (n - 1)/n, n/n$.
Reduce each to lowest terms; i.e., express each number as a quotient of
relatively prime integers. The denominators will all be divisors of $n$. If $d \mid n$,
exactly $\phi(d)$ of our numbers will have $d$ in the denominator after reducing to
lowest terms. Thus $\sum_{d \mid n} \phi(d) = n$. □

Proposition 2.2.5. If $n = p_1^{a_1} p_2^{a_2} \cdots p_l^{a_l}$, then

$$\phi(n) = n(1 - (1/p_1))(1 - (1/p_2)) \cdots (1 - (1/p_l)).$$

Proof. Since $n = \sum_{d \mid n} \phi(d)$ the Möbius inversion theorem implies that
$\phi(n) = \sum_{d \mid n} \mu(d) n/d = n - \sum_i n/p_i + \sum_{i<j} n/p_i p_j - \cdots =
n(1 - (1/p_1))(1 - (1/p_2)) \cdots (1 - (1/p_l))$. □
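
As a sanity check, the product formula can be compared with a direct count of the integers between 1 and $n$ that are relatively prime to $n$. The function names below are illustrative; the formula is evaluated in integer arithmetic by replacing `result` with `result * (1 - 1/p)` one prime at a time.

```python
from math import gcd


def phi_count(n):
    """Euler phi by definition: count k in 1..n with gcd(k, n) == 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def phi_formula(n):
    """Euler phi via n * product over primes p | n of (1 - 1/p)."""
    result = n
    m = n
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply by (1 - 1/p); p divides result here
        p += 1
    if m > 1:
        result -= result // m
    return result


print(phi_formula(9), phi_count(9))
print(sum(phi_formula(d) for d in range(1, 37) if 36 % d == 0))  # Proposition 2.2.4 with n = 36
```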
Later we shall give a more insightful proof of this formula. We shall also
use the Möbius function to determine the number of monic irreducible
polynomials of fixed degree in $k[x]$, where $k$ is a finite field.


§3 $\sum 1/p$ Diverges

We began this chapter by proving that there are infinitely many prime 
numbers in Z. We shall conclude by proving a somewhat stronger statement. 
The proof will assume some elementary facts from the theory of infinite series. 

Theorem 3. $\sum 1/p$ diverges, where the sum is over all positive primes in $\mathbb{Z}$.

Proof. Let $p_1, p_2, \ldots, p_{l(n)}$ be all the primes less than $n$ and define
$\lambda(n) = \prod_{i=1}^{l(n)} (1 - 1/p_i)^{-1}$. Since $(1 - 1/p_i)^{-1} = \sum_{a_i=0}^{\infty} 1/p_i^{a_i}$ we see that

$$\lambda(n) = \sum (p_1^{a_1} p_2^{a_2} \cdots p_l^{a_l})^{-1},$$

where the sum is over all $l$-tuples of nonnegative integers $(a_1, a_2, \ldots, a_l)$.
In particular, we see that $1 + \tfrac{1}{2} + \tfrac{1}{3} + \cdots + 1/n < \lambda(n)$. Thus $\lambda(n) \to \infty$ as
$n \to \infty$. This already gives a new proof that there are infinitely many primes.
Next, consider $\log \lambda(n)$. We have

$$\log \lambda(n) = -\sum_{i=1}^{l} \log(1 - p_i^{-1}) = \sum_{i=1}^{l} \sum_{m=1}^{\infty} (m p_i^m)^{-1}
= p_1^{-1} + p_2^{-1} + \cdots + p_l^{-1} + \sum_{i=1}^{l} \sum_{m=2}^{\infty} (m p_i^m)^{-1}.$$

Now, $\sum_{m=2}^{\infty} (m p_i^m)^{-1} < \sum_{m=2}^{\infty} p_i^{-m} = p_i^{-2}(1 - p_i^{-1})^{-1} \le 2 p_i^{-2}$. Thus
$\log \lambda(n) < p_1^{-1} + p_2^{-1} + \cdots + p_l^{-1} + 2(p_1^{-2} + p_2^{-2} + \cdots + p_l^{-2})$. It is well known
that $\sum_{n=1}^{\infty} n^{-2}$ converges. It follows that $\sum_{i=1}^{\infty} p_i^{-2}$ converges. Thus if
$\sum p^{-1}$ converged, there would be a constant $M$ such that $\log \lambda(n) < M$, or
$\lambda(n) < e^M$. This, however, is impossible since $\lambda(n) \to \infty$ as $n \to \infty$. Thus
$\sum p^{-1}$ diverges. □

It is instructive to try to construct an analog of Theorem 3 for the ring
$k[x]$, where $k$ is a finite field with $q$ elements. The role of the positive primes
$p$ is taken by the monic irreducible polynomials $p(x)$. The “size” of a monic
polynomial $f(x)$ is given by the quantity $q^{\deg f(x)}$.

This is reasonable because for a positive integer $n$, $n$ is the number of
nonnegative integers less than $n$, i.e., the number of elements in the set
$\{0, 1, 2, \ldots, n - 1\}$. Analogously, $q^{\deg f(x)}$ is the number of polynomials of
degree less than $\deg f(x)$. This is easy to see. Any such polynomial has the
form $a_0 x^m + a_1 x^{m-1} + \cdots + a_m$, where $m = \deg f(x) - 1$ and $a_i \in k$. There
are $q$ choices for $a_i$ and the choice for each index is independent of the others.
Thus there are $q^{m+1} = q^{\deg f(x)}$ such polynomials.

Theorem 4. $\sum q^{-\deg p(x)}$ diverges, where the sum is over all monic irreducibles
$p(x)$ in $k[x]$.

Proof. We first show that $\sum q^{-\deg f(x)}$ diverges and that $\sum q^{-2\deg f(x)}$ converges,
where both sums are over all monic polynomials $f(x)$ in $k[x]$. Both
results follow from the fact that there are exactly $q^n$ monic polynomials of degree
$n$ in $k[x]$. Consider $\sum_{\deg f(x) \le n} q^{-\deg f(x)}$. This sum is equal to $\sum_{m=0}^{n} q^m q^{-m}
= n + 1$. Thus $\sum q^{-\deg f(x)}$ diverges. Similarly, $\sum_{\deg f(x) \le n} q^{-2\deg f(x)} =
\sum_{m=0}^{n} q^m q^{-2m} < (1 - 1/q)^{-1}$. Thus $\sum q^{-2\deg f(x)}$ converges.

The rest of the proof is an exact imitation of the proof of Theorem 3.
The reader should fill in the details. □


§4 The Growth of $\pi(x)$

In the introduction to Chapter 1 we defined $\pi(x)$ as the number of primes $p$,
$1 < p \le x$. The study of the behavior of $\pi(x)$ for large $x$ involves analytic
techniques. We will prove in this section several results that require a minimum
of results from analysis. In fact only the simplest properties of the
logarithmic function are used.

We begin with the following simple consequence of Euclid’s argument
(Theorem 1) which gives a weak lower bound for $\pi(x)$. Throughout $\log x$
denotes the natural logarithm of $x$.

Proposition 2.4.1. $\pi(x) \ge \log(\log x)$, $x \ge 2$.

Proof. Let $p_n$ denote the $n$th prime. Then since any prime dividing $p_1 p_2 \cdots p_n
+ 1$ is distinct from $p_1, \ldots, p_n$ it follows that $p_{n+1} \le p_1 p_2 \cdots p_n + 1$. Now
$p_1 < 2^{(2^1)}$, $p_2 < 2^{(2^2)}$ and if $p_n < 2^{(2^n)}$ then
$p_{n+1} \le 2^{(2^1)} 2^{(2^2)} \cdots 2^{(2^n)} + 1 = 2^{2^{n+1} - 2} + 1 < 2^{(2^{n+1})}$. It follows that
$\pi(2^{(2^n)}) \ge n$. For $x > e$ choose an
integer $n$ so that $e^{(e^{n-1})} < x \le e^{(e^n)}$. If $n > 3$ then $e^{n-1} > 2^n$ so that

$$\pi(x) \ge \pi(e^{(e^{n-1})}) \ge \pi(e^{2^n}) \ge \pi(2^{2^n}) \ge n \ge \log(\log x).$$

This proves the result for $x > e^{e^3}$. If $x \le e^{e^3}$ the inequality is obvious. □

The method employed in the paragraph following Proposition 2.2.1 to
show that $\pi(x) \to \infty$ can also be used to obtain the following improvement
of the above proposition. If $n$ is a positive integer let $\gamma(n)$ denote the set of
primes dividing $n$.

Proposition 2.4.2. $\pi(x) \ge \log x / 2 \log 2$.

Proof. For any set of primes $S$ define $f_S(x)$ to be the number of integers $n$,
$1 \le n \le x$, with $\gamma(n) \subseteq S$. Suppose that $S$ is a finite set with $t$ elements.
Writing such an $n$ in the form $n = m^2 s$ with $s$ square free we see that $m \le \sqrt{x}$
while $s$ has at most $2^t$ choices corresponding to the various subsets of $S$. Thus
$f_S(x) \le 2^t \sqrt{x}$. Put $\pi(x) = m$ so that $p_{m+1} > x$. If $S = \{p_1, \ldots, p_m\}$ then
clearly $f_S(x) = x$ which implies that $x \le 2^m \sqrt{x} = 2^{\pi(x)} \sqrt{x}$. The result follows
immediately. □

It is interesting to note that the above method can also be used to give
another proof to Theorem 3. For if $\sum 1/p_n$ converged then there is an $n$ such
that $\sum_{j>n} 1/p_j < \tfrac{1}{2}$. If $S = \{p_1, \ldots, p_n\}$ then $x - f_S(x)$ is the number of
positive integers $m \le x$ with $\gamma(m) \not\subseteq S$. That is, there exists a prime $p_j$, $j > n$
such that $p_j \mid m$. For such a prime there are $[x/p_j]$ multiples of $p_j$ not exceeding
$x$. Thus

$$x - f_S(x) \le \sum_{j>n} \left[\frac{x}{p_j}\right] \le \sum_{j>n} \frac{x}{p_j} < \frac{x}{2}$$

so that $f_S(x) \ge x/2$. On the other hand, $f_S(x) \le 2^n \sqrt{x}$. These inequalities
imply $2^n \ge \sqrt{x}/2$ which is false for $n$ fixed and large $x$.

A function closely related to $\pi(x)$ is defined by $\theta(x) = \sum_{p \le x} \log p$, the
sum being over all primes at most $x$. We will use $\theta(x)$ to bound $\pi(x)$ from
above. Put $\theta(1) = 0$.


Proposition 2.4.3. $\theta(x) < (4 \log 2)x$.

Proof. Consider the binomial coefficient

$$\binom{2n}{n} = \frac{(n + 1) \cdots (2n)}{1 \cdot 2 \cdots n}.$$

Clearly this integer is divisible by all primes $p$, $n < p \le 2n$. Furthermore,
since

$$(1 + 1)^{2n} = \sum_{j=0}^{2n} \binom{2n}{j},$$

we have $\binom{2n}{n} < 2^{2n}$. Hence

$$2^{2n} > \binom{2n}{n} \ge \prod_{n < p \le 2n} p$$

and therefore

$$2n \log 2 > \sum_{n < p \le 2n} \log p = \theta(2n) - \theta(n).$$

Summing this relation for $n = 1, 2, 4, 8, \ldots, 2^{m-1}$ gives

$$\theta(2^m) < (\log 2)(2^{m+1} - 2) < (\log 2) 2^{m+1}.$$

If $2^{m-1} < x \le 2^m$ we obtain

$$\theta(x) \le \theta(2^m) < (\log 2) 2^{m+1} = (4 \log 2) 2^{m-1} < (4 \log 2)x.$$

□
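
The two inequalities in this proof are easy to test numerically. The sketch below is only an illustration under our own naming (`theta` is a hypothetical helper using trial division, which is slow but sufficient for small arguments); it checks the key step $\theta(2n) - \theta(n) < 2n \log 2$ and the final bound $\theta(x) < (4 \log 2)x$.

```python
import math

def theta(x):
    """theta(x) = sum of log p over primes p <= x (trial division; small x only)."""
    total = 0.0
    for n in range(2, int(x) + 1):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            total += math.log(n)
    return total

for n in (1, 2, 4, 8, 16, 32):
    # Key step of the proof: theta(2n) - theta(n) < 2n log 2.
    print(n, theta(2 * n) - theta(n) < 2 * n * math.log(2))

for x in (2, 10, 100, 1000):
    # Compare theta(x) with the upper bound (4 log 2) x.
    print(x, round(theta(x), 3), round(4 * math.log(2) * x, 3))
```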


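Corollary 1. There is a positive constant $c_1$ such that $\pi(x) < c_1 x/\log x$ for
$x \ge 2$.

Proof.

$$\theta(x) \ge \sum_{\sqrt{x} < p \le x} \log p \ge (\log \sqrt{x})(\pi(x) - \pi(\sqrt{x})) \ge (\log \sqrt{x})\pi(x) - \sqrt{x} \log \sqrt{x}.$$

Thus

$$\pi(x) \le \frac{2\theta(x)}{\log x} + \sqrt{x} < (8 \log 2) \frac{x}{\log x} + \sqrt{x}.$$

The result follows by noting that $\sqrt{x} < 2x/\log x$ for $x \ge 2$. □

Corollary 2. $\pi(x)/x \to 0$ as $x \to \infty$.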


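To bound $\pi(x)$ from below we begin by examining further the binomial
coefficient $\binom{2n}{n}$. First of all

$$\binom{2n}{n} = \frac{n + 1}{1} \cdot \frac{n + 2}{2} \cdots \frac{n + n}{n} \ge 2^n.$$

On the other hand by Exercise 6 at the end of this chapter we have

$$\operatorname{ord}_p \binom{2n}{n} = \operatorname{ord}_p \frac{(2n)!}{(n!)^2} = \sum_{j=1}^{t_p} \left( \left[\frac{2n}{p^j}\right] - 2\left[\frac{n}{p^j}\right] \right),$$

where $t_p$ is the largest integer such that $p^{t_p} \le 2n$. Thus $t_p = [\log 2n/\log p]$.
Now it is easy to see that $[2x] - 2[x]$ is always 1 or 0. It follows that

$$\operatorname{ord}_p \binom{2n}{n} \le \left[\frac{\log 2n}{\log p}\right].$$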
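
The formula of Exercise 6 and the resulting bound can be checked on a small case. The sketch below is an illustration with hypothetical helper names (`ord_p`, `ord_p_factorial`, `t_p`); it computes $t_p$ with integer arithmetic instead of logarithms, which is an implementation choice of ours to avoid rounding problems.

```python
import math

def ord_p(p, m):
    """Exponent of the prime p in the positive integer m."""
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e

def ord_p_factorial(p, n):
    """ord_p(n!) = [n/p] + [n/p^2] + [n/p^3] + ... (Exercise 6)."""
    total, q = 0, p
    while q <= n:
        total += n // q
        q *= p
    return total

def t_p(p, n):
    """Largest t with p**t <= 2n, i.e. [log 2n / log p], found without floats."""
    t, q = 0, p
    while q <= 2 * n:
        t += 1
        q *= p
    return t

n = 10
central = math.comb(2 * n, n)
for p in (2, 3, 5, 7, 11, 13, 17, 19):
    via_exercise_6 = ord_p_factorial(p, 2 * n) - 2 * ord_p_factorial(p, n)
    # Columns: p, ord_p of C(2n, n) computed directly, via Exercise 6, and t_p.
    print(p, ord_p(p, central), via_exercise_6, t_p(p, n))
```

In every row the first two computed values agree and neither exceeds the last column.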


Proposition 2.4.4. There is a positive constant $c_2$ such that $\pi(x) > c_2 (x/\log x)$.

Proof. By the above we have

$$2^n \le \binom{2n}{n} \le \prod_{p < 2n} p^{t_p}.$$

Thus

$$n \log 2 \le \sum_{p < 2n} t_p \log p = \sum_{p < 2n} \left[\frac{\log 2n}{\log p}\right] \log p.$$

If $\log p > \frac{1}{2} \log 2n$, i.e., $p > \sqrt{2n}$, then $[\log 2n/\log p] = 1$. Thus

$$n \log 2 \le \sum_{p \le \sqrt{2n}} \left[\frac{\log 2n}{\log p}\right] \log p + \sum_{\sqrt{2n} < p < 2n} \log p \le \sqrt{2n} \log 2n + \theta(2n).$$

Therefore $\theta(2n) \ge n \log 2 - \sqrt{2n} \log 2n$. But $\sqrt{2n} \log 2n/n$ approaches 0
as $n \to \infty$, so that $\theta(2n) > Tn$ for some $T > 0$ and all $n$ sufficiently large.
Writing, for large $x$, $2n \le x < 2n + 1$ we have $\theta(x) \ge \theta(2n) > Tn >
T(x - 1)/2 > Cx$ for a suitable constant $C$. Thus there is a constant $c_2 > 0$
such that $\theta(x) > c_2 x$ for all $x \ge 2$. To complete the proof we observe that

$$\theta(x) = \sum_{p \le x} \log p \le \pi(x) \log x.$$

Thus

$$\pi(x) \ge \frac{\theta(x)}{\log x} > c_2 \frac{x}{\log x}.$$

□


The preceding two propositions were first proven by Tchebychef in 1852.
These results are subsumed under the famous prime number theorem which
asserts that in fact $\pi(x)(\log x/x) \to 1$ as $x \to \infty$. It is not difficult to see that
this is equivalent to $\theta(x)/x \to 1$ as $x \to \infty$. The prime number theorem was
conjectured, in a slightly different form, by Gauss at the age of 15 or 16. The
proof of the conjecture was not achieved until 1896 when J. Hadamard and
Ch. de la Vallée Poussin established the result independently. Their proofs
utilize complex analytic properties of the Riemann zeta function. In 1948
Atle Selberg was able to prove the result without the use of complex analysis.


Notes 

There are a multitude of unsolved problems in the theory of prime numbers.
For example, it is not known if there are infinitely many primes of the form
$n^2 + 1$. On the other hand we will prove in Chapter 16 that the linear polynomial
$an + b$ always represents an infinite number of primes when $(a, b) = 1$.
This is the celebrated theorem of Dirichlet on primes in an arithmetic progression.

It is not known whether there exist infinitely many primes of the form
$2^N + 1$, the so-called Fermat primes, or if there are infinitely many primes of
the form $2^N - 1$, the Mersenne primes.

Another outstanding problem is to decide whether there are an infinite
number of primes $p$ such that $p + 2$ is also prime. It is known that the sum
of the reciprocals of the set of such primes converges, a result due to Viggo
Brun [52].

Good discussions of unsolved problems about primes may be found in 
W. Sierpinski [71] and Shanks [70]. Readers with a background in analysis 
should read the paper by P. Erdos [31] as well as those of Hardy [38] and 
[39]. 

The key idea behind the proof of Theorem 2 is due to L. Euler. A pleasant 
account of this for the beginner is found in Rademacher and Toeplitz [65]. 

Theorem 3 gives a proof in the spirit of Euler that $k[x]$ contains infinitely
many irreducibles. This already suggests that many of the theorems in classical
number theory have analogs in the ring $k[x]$. This is indeed the case. An
interesting reference along these lines is L. Carlitz [10].

The theorem of Dirichlet mentioned above has been proved for $k[x]$, $k$ a
finite field, by H. Kornblum [50]. Kornblum had his promising career cut
short after he enlisted as Kriegsfreiwilliger in 1914. The prime number
theorem also has an analog in $k[x]$. This was proved by E. Artin in his
doctoral thesis [2].

A good introduction to analytic number theory is Chandrasekharan [112]. 
In the last chapter of this very readable text a proof of the prime number 
theorem is given that uses complex analysis. Proofs that are free of complex 
analysis (but not of subtlety) have been given by A. Selberg [215] and 
P. Erdos [133]. For an interesting account of the history of this theorem see 
L. J. Goldstein [139]. Finally we recommend the remarkable tract Primzahlen
by E. Trost [229]; this 95 page book contains, in addition to many
elementary results concerning the distribution of primes, Selberg’s proof of 
the prime number theorem as well as an “elementary” proof of Dirichlet’s 
theorem mentioned above. See also D. J. Newman [198]. 


Exercises 


1. Show that $k[x]$, with $k$ a finite field, has infinitely many irreducible polynomials.

2. Let $p_1, p_2, \ldots, p_t \in \mathbb{Z}$ be primes and consider the set of all rational numbers $r = a/b$,
$a, b \in \mathbb{Z}$, such that $\operatorname{ord}_{p_i} a \ge \operatorname{ord}_{p_i} b$ for $i = 1, 2, \ldots, t$. Show that this set is a ring
and that up to taking associates $p_1, p_2, \ldots, p_t$ are the only primes.

3. Use the formula for $\varphi(n)$ to give a proof that there are infinitely many primes.
[Hint: If $p_1, p_2, \ldots, p_t$ were all the primes, then $\varphi(n) = 1$, where $n = p_1 p_2 \cdots p_t$.]

4. If $a$ is a nonzero integer, then for $n > m$ show that $(a^{2^n} + 1, a^{2^m} + 1) = 1$ or 2
depending on whether $a$ is even or odd. (Hint: If $p$ is an odd prime and $p \mid a^{2^m} + 1$,
then $p \mid a^{2^n} - 1$ for $n > m$.)

5. Use the result of Exercise 4 to show that there are infinitely many primes. (This proof 
is due to G. Polya.) 


6. For a rational number $r$ let $[r]$ be the largest integer less than or equal to $r$, e.g.,
$[\frac{1}{2}] = 0$, $[2] = 2$, and $[3\frac{1}{3}] = 3$. Prove $\operatorname{ord}_p n! = [n/p] + [n/p^2] + [n/p^3] + \cdots$.

7. Deduce from Exercise 6 that $\operatorname{ord}_p n! \le n/(p - 1)$ and that $\sqrt[n]{n!} \le \prod_{p \mid n!} p^{1/(p-1)}$.

8. Use Exercise 7 to show that there are infinitely many primes. [Hint: $(n!)^2 \ge n^n$.]
(This proof is due to Eckford Cohen.)

9. A function on the integers is said to be multiplicative if $f(ab) = f(a)f(b)$, whenever
$(a, b) = 1$. Show that a multiplicative function is completely determined by its value
on prime powers.

10. If $f(n)$ is a multiplicative function, show that the function $g(n) = \sum_{d \mid n} f(d)$ is also
multiplicative.

11. Show that $\varphi(n) = n \sum_{d \mid n} \mu(d)/d$ by first proving that $\mu(d)/d$ is multiplicative and then
using Exercises 9 and 10.

12. Find formulas for $\sum_{d \mid n} \mu(d)\varphi(d)$, $\sum_{d \mid n} \mu(d)^2 \varphi(d)^2$, and $\sum_{d \mid n} \mu(d)/\varphi(d)$.

13. Let $\sigma_k(n) = \sum_{d \mid n} d^k$. Show that $\sigma_k(n)$ is multiplicative and find a formula for it.

14. If $f(n)$ is multiplicative, show that $h(n) = \sum_{d \mid n} \mu(n/d) f(d)$ is also multiplicative.

15. Show that

(a) $\sum_{d \mid n} \mu(n/d)\nu(d) = 1$ for all $n$.

(b) $\sum_{d \mid n} \mu(n/d)\sigma(d) = n$ for all $n$.

16. Show that $\nu(n)$ is odd iff $n$ is a square.

17. Show that $\sigma(n)$ is odd iff $n$ is a square or twice a square.

18. Prove that $\varphi(n)\varphi(m) = \varphi((n, m))\varphi([n, m])$.

19. Prove that $\varphi(mn)\varphi((m, n)) = (m, n)\varphi(m)\varphi(n)$.

20. Prove that $\prod_{d \mid n} d = n^{\nu(n)/2}$.

21. Define $\Lambda(n) = \log p$ if $n$ is a power of $p$ and zero otherwise. Prove that $\sum_{d \mid n} \mu(n/d) \log d
= \Lambda(n)$. [Hint: First calculate $\sum_{d \mid n} \Lambda(d)$ and then apply the Möbius inversion
formula.]

22. Show that the sum of all the integers $t$ such that $1 \le t \le n$ and $(t, n) = 1$ is $\frac{1}{2} n \varphi(n)$.

23. Let $f(x) \in \mathbb{Z}[x]$ and let $\psi(n)$ be the number of $f(j)$, $j = 1, 2, \ldots, n$, such that $(f(j), n)
= 1$. Show that $\psi(n)$ is multiplicative and that $\psi(p^t) = p^{t-1}\psi(p)$. Conclude that
$\psi(n) = n \prod_{p \mid n} \psi(p)/p$.

24. Supply the details to the proof of Theorem 3. 

25. Consider the function $\zeta(s) = \sum_{n=1}^{\infty} 1/n^s$. $\zeta(s)$ is called the Riemann zeta function. It
converges for $s > 1$. Prove the formal identity (Euler's identity) $\zeta(s) = \prod_p (1 -
(1/p^s))^{-1}$. If we let $s$ assume complex values, it can be shown that $\zeta(s)$ has an analytic
continuation to the whole complex plane. The famous Riemann hypothesis states
that the only zeros of $\zeta(s)$ lying in the strip $0 \le \operatorname{Re} s \le 1$ lie on the line $\operatorname{Re} s = \frac{1}{2}$.

26. Verify the formal identities

(a) $\zeta(s)^{-1} = \sum_{n=1}^{\infty} \mu(n)/n^s$.

(b) $\zeta(s)^2 = \sum_{n=1}^{\infty} \nu(n)/n^s$.

(c) $\zeta(s)\zeta(s - 1) = \sum_{n=1}^{\infty} \sigma(n)/n^s$.

27. Show that $\sum 1/n$, the sum being over square free integers, diverges. Conclude that
$\prod_{p < N} (1 + 1/p) \to \infty$ as $N \to \infty$. Since $e^x > 1 + x$, conclude that $\sum_{p < N} 1/p \to \infty$.
(This proof is due to I. Niven.)



Chapter 3 

Congruence 


Gauss first introduced the notion of congruence in Disquisitiones
Arithmeticae (see Notes in Chapter 1). It is
an extremely simple idea. Nevertheless, its importance
and usefulness in number theory cannot be exaggerated.

This chapter is devoted to an exposition of the simplest 
properties of congruence. In Chapter 4, we shall go into 
the subject in more depth. 


§ 1 Elementary Observations 


It is a simple observation that the product of two odd numbers is odd, the 
product of two even numbers is even, and the product of an odd and even 
number is even. Also, notice that an odd plus an odd is even, an even plus an 
even is even, and an even plus an odd is odd. This information is summarized 
in Tables 1 and 2. Table 1 is like a multiplication table and Table 2 like an 
addition table. 


        Table 1                 Table 2

      ×  |  e   o             +  |  e   o
     ----+---------          ----+---------
      e  |  e   e             e  |  e   o
      o  |  e   o             o  |  o   e


These observations are so elementary one might ask if anything interesting 
can be deduced from them. The answer, surprisingly, is yes. 

Many problems in number theory have the form: if $f$ is a polynomial in
one or several variables with integer coefficients, does the equation $f = 0$
have integer solutions? Such questions were considered by the Greek
mathematician Diophantus and are called Diophantine problems in his 
honor. 

Consider the equation $x^2 - 117x + 31 = 0$. We claim that there is no
solution that is an integer. Let $n$ be any integer. $n$ is either even or odd. If $n$
is even, so is $n^2$ and $117n$. Thus $n^2 - 117n + 31$ is odd. If $n$ is odd, then $n^2$
and $117n$ are both odd. Thus $n^2 - 117n + 31$ is odd in this case also. Since
every integer is even or odd, this shows that $n^2 - 117n + 31$ is never zero.

In Chapter 2 we showed that there are infinitely many prime numbers. 
We shall now show that there are infinitely many prime numbers that leave 
a remainder of 3 when divided by 4. Examples of such primes are 3, 7, 19, 
and 59. 

An integer divided by 4 leaves a remainder of 0, 1, 2, or 3. Thus odd
numbers are either of the form $4k + 1$ or $4l + 3$. The product of two numbers
of the form $4k + 1$ is again of that form: $(4k + 1)(4k' + 1) = 4(4kk' + k
+ k') + 1$. It follows that an integer of the form $4l + 3$ must be divisible by
a prime of the form $4l + 3$.

Now, suppose that there were only finitely many positive primes of the
form $4l + 3$. This list begins 3, 7, 11, 19, 23, .... Let $p_1 = 7$, $p_2 = 11$, $p_3 = 19$,
etc. Suppose that $p_m$ is the largest prime of this form and set $N = 4p_1 p_2 \cdots
p_m + 3$. $N$ is not divisible by any of the $p_i$, nor by 3, since 3 does not divide
$4p_1 p_2 \cdots p_m$. However, $N$ is of the form $4l + 3$
and so must be divisible by a prime $p$ of the form $4l + 3$. We have $p > p_m$,
which is a contradiction.
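
The construction in this argument can be carried out explicitly. The sketch below is only an illustration: the function names are hypothetical, and trial division is assumed to be fast enough for the small numbers involved. Given some primes of the form $4l + 3$ other than 3, it forms $N$ and extracts a prime factor of $N$ of the same form, which is then necessarily a new one.

```python
def prime_factors(n):
    """Prime factors of n > 1 by trial division, with multiplicity."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def new_prime_3_mod_4(known):
    """Given primes of the form 4l + 3 other than 3, return a prime factor of
    N = 4 * (product of the known primes) + 3 that is congruent to 3 modulo 4."""
    product = 1
    for p in known:
        product *= p
    N = 4 * product + 3
    # N has the form 4l + 3, so at least one prime factor has that form too.
    return next(q for q in prime_factors(N) if q % 4 == 3)

print(new_prime_3_mod_4([7, 11, 19]))   # N = 5855 = 5 * 1171
```

Note that not every prime factor of $N$ has the form $4l + 3$ (here 5 does not); the argument only guarantees that at least one does.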

There is clearly some common principle underlying both arguments. We 
explore this in Section 2. 


§2 Congruence in Z 

Definition. If $a, b, m \in \mathbb{Z}$ and $m \ne 0$, we say that $a$ is congruent to $b$ modulo $m$
if $m$ divides $b - a$. This relation is written $a \equiv b \ (m)$.

Proposition 3.2.1.

(a) $a \equiv a \ (m)$.

(b) $a \equiv b \ (m)$ implies that $b \equiv a \ (m)$.

(c) If $a \equiv b \ (m)$ and $b \equiv c \ (m)$, then $a \equiv c \ (m)$.

Proof.

(a) $a - a = 0$ and $m \mid 0$.

(b) If $m \mid b - a$, then $m \mid a - b$.

(c) If $m \mid b - a$ and $m \mid c - b$, then $m \mid c - a = (c - b) + (b - a)$. □

Proposition 3.2.1 shows that congruence modulo $m$ is an equivalence
relation on the set of integers. If $a \in \mathbb{Z}$, let $\bar{a}$ denote the set of integers congruent
to $a$ modulo $m$, $\bar{a} = \{n \in \mathbb{Z} \mid n \equiv a \ (m)\}$. In other words $\bar{a}$ is the set of integers
of the form $a + km$.

If $m = 2$, then $\bar{0}$ is the set of even integers and $\bar{1}$ is the set of odd integers.
Definition. A set of the form $\bar{a}$ is called a congruence class modulo $m$.

Proposition 3.2.2.

(a) $\bar{a} = \bar{b}$ iff $a \equiv b \ (m)$.

(b) $\bar{a} \ne \bar{b}$ iff $\bar{a} \cap \bar{b}$ is empty.

(c) There are precisely $m$ distinct congruence classes modulo $m$.

Proof.

(a) If $\bar{a} = \bar{b}$, then $a \in \bar{a} = \bar{b}$. Thus $a \equiv b \ (m)$. Conversely, if $a \equiv b \ (m)$, then
$a \in \bar{b}$. If $c \equiv a \ (m)$, then $c \equiv b \ (m)$, which shows $\bar{a} \subseteq \bar{b}$. Since $a \equiv b \ (m)$
implies that $b \equiv a \ (m)$, we also have $\bar{b} \subseteq \bar{a}$. Therefore $\bar{a} = \bar{b}$.

(b) Clearly, if $\bar{a} \cap \bar{b}$ is empty, then $\bar{a} \ne \bar{b}$. We shall show that $\bar{a} \cap \bar{b}$ not empty
implies that $\bar{a} = \bar{b}$. Let $c \in \bar{a} \cap \bar{b}$. Then $c \equiv a \ (m)$ and $c \equiv b \ (m)$. It
follows that $a \equiv b \ (m)$ and so by part (a) we have $\bar{a} = \bar{b}$.

(c) We shall show that $\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{m - 1}$ are all distinct and are a complete
set of congruence classes modulo $m$. Suppose that $0 \le k < l < m$. $\bar{k} = \bar{l}$
implies that $k \equiv l \ (m)$ or that $m$ divides $l - k$. Since $0 < l - k < m$ this
is a contradiction. Therefore $\bar{k} \ne \bar{l}$. Now let $a \in \mathbb{Z}$. We can find integers
$q$ and $r$ such that $a = qm + r$, where $0 \le r < m$. It follows that $a \equiv r \ (m)$
and that $\bar{a} = \bar{r}$. □

Definition. The set of congruence classes modulo $m$ is denoted by $\mathbb{Z}/m\mathbb{Z}$.

If $\bar{a}_1, \bar{a}_2, \ldots, \bar{a}_m$ are a complete set of congruence classes modulo $m$, then
$\{a_1, a_2, \ldots, a_m\}$ is called a complete set of residues modulo $m$.

For example, $\{0, 1, 2, 3\}$, $\{4, 9, 14, -1\}$, and $\{0, 1, -2, -1\}$ are complete
sets of residues modulo 4.

The set $\mathbb{Z}/m\mathbb{Z}$ can be made into a ring by defining in a natural way addition
and multiplication. This is accomplished by means of the following proposition.

Proposition 3.2.3. If $a \equiv c \ (m)$ and $b \equiv d \ (m)$, then $a + b \equiv c + d \ (m)$ and
$ab \equiv cd \ (m)$.

Proof. If $m \mid c - a$ and $m \mid d - b$, then $m \mid (c - a) + (d - b) = (c + d) -
(a + b)$. Thus $a + b \equiv c + d \ (m)$.

Notice that $cd - ab = c(d - b) + b(c - a)$. Thus $m \mid cd - ab$ and $ab \equiv
cd \ (m)$. □

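If $\bar{a}, \bar{b} \in \mathbb{Z}/m\mathbb{Z}$, we define $\bar{a} + \bar{b}$ to be $\overline{a + b}$ and $\bar{a}\bar{b}$ to be $\overline{ab}$.

This definition seems to depend on $a$ and $b$. We have to show that they
depend only on the congruence classes defined by $a$ and $b$. This is easy.

Assume that $\bar{c} = \bar{a}$ and that $\bar{d} = \bar{b}$. We must show that $\overline{a + b} = \overline{c + d}$ and
that $\overline{ab} = \overline{cd}$, but this follows immediately from Propositions 3.2.2 and 3.2.3.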

With these definitions $\mathbb{Z}/m\mathbb{Z}$ becomes a ring. The verification of this fact is
left to the reader.

        Table 3                    Table 4
        Addition                   Multiplication

      +  |  0   1   2            ×  |  0   1   2
     ----+-------------         ----+-------------
      0  |  0   1   2            0  |  0   0   0
      1  |  1   2   0            1  |  0   1   2
      2  |  2   0   1            2  |  0   2   1

Tables 3 and 4 give explicitly the addition and multiplication in Z/3Z. 
(Bars over the numbers are omitted.) The reader should construct similar 
tables for m = 4, 5, and 6. 
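
For readers who want to check their hand-built tables, here is a minimal sketch. The function names are our own, and each congruence class is represented by its least nonnegative residue, which is one convenient choice among many. The last helper tests whether a list of integers is a complete set of residues.

```python
def addition_table(m):
    """Addition table of Z/mZ; rows and columns are indexed by 0, 1, ..., m-1."""
    return [[(a + b) % m for b in range(m)] for a in range(m)]

def multiplication_table(m):
    """Multiplication table of Z/mZ, indexed the same way."""
    return [[(a * b) % m for b in range(m)] for a in range(m)]

def is_complete_residue_system(values, m):
    """True when the values hit each congruence class modulo m exactly once."""
    return sorted(v % m for v in values) == list(range(m))

for row in multiplication_table(3):
    print(row)
print(is_complete_residue_system([4, 9, 14, -1], 4))
```

Calling `addition_table(m)` and `multiplication_table(m)` with m = 4, 5, or 6 produces the tables requested above.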

In discussing arithmetic problems it is sometimes more convenient to 
work with the ring Z/mZ than with the notion of congruence modulo m. On 
the other hand, it is sometimes more convenient the other way around. We 
shall switch back and forth between the two viewpoints as the situation 
demands. 

We proved earlier that the polynomial $x^2 - 117x + 31$ has no integer
roots. It is possible to generalize this result using some of the material we
have developed.

If $a \equiv b \ (m)$, then $a^2 \equiv b^2 \ (m)$, $a^3 \equiv b^3 \ (m)$, and in general $a^n \equiv b^n \ (m)$.
It follows that if $p(x) \in \mathbb{Z}[x]$, then $p(a) \equiv p(b) \ (m)$. All this is a consequence
of Proposition 3.2.3.

Take $m = 2$. Then $a$ is congruent to either 0 or 1 modulo 2 and we have
$p(a) \equiv p(0) \ (2)$ or $p(a) \equiv p(1) \ (2)$.

If $p(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$, then $p(0) = a_n$ and $p(1) =
a_0 + a_1 + \cdots + a_n$. Our calculations yield the following result: If $p(x) \in
\mathbb{Z}[x]$ and $p(0)$ and $p(1)$ are both odd, then $p(x)$ has no integer roots.

$x^2 - 117x + 31$ has constant term 31, and the sum of the coefficients is
$-85$, both of which are odd. Other examples are $2x^2 + 2x + 1$ and $3x^3 +
2x^2 + x + 3$.
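
The criterion just stated is easy to mechanize. The following sketch uses hypothetical function names and assumes a polynomial is given by its coefficient list $[a_0, a_1, \ldots, a_n]$, highest power first; it applies the parity test to the three examples above.

```python
def has_no_integer_root_by_parity(coeffs):
    """coeffs = [a_0, a_1, ..., a_n] for p(x) = a_0 x^n + ... + a_n.

    Returns True when p(0) and p(1) are both odd, in which case p has no
    integer root. A False result is inconclusive: the test is sufficient,
    not necessary.
    """
    p0 = coeffs[-1]    # p(0) is the constant term a_n
    p1 = sum(coeffs)   # p(1) is the sum of the coefficients
    return p0 % 2 == 1 and p1 % 2 == 1

def evaluate(coeffs, x):
    """Evaluate p(x) by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

for coeffs in ([1, -117, 31], [2, 2, 1], [3, 2, 1, 3]):
    print(coeffs, has_no_integer_root_by_parity(coeffs))
```

A polynomial such as $x^2 - 1$ fails the test even though the test says nothing about it in either direction; here it happens to have the integer roots $\pm 1$.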


§3 The Congruence $ax \equiv b \ (m)$

The simplest congruence is $ax \equiv b \ (m)$. In this section we shall develop a
criterion to test this congruence for solvability, and if it is solvable, give a
formula for the number of solutions.

Before beginning we must give a definition of what we mean by the number
of solutions to a congruence. Quite generally, let $f(x_1, \ldots, x_n)$ be a polynomial
in $n$ variables with integer coefficients and consider the congruence
$f(x_1, \ldots, x_n) \equiv 0 \ (m)$. A solution is an $n$-tuple of integers $(a_1, \ldots, a_n)$ such
that $f(a_1, a_2, \ldots, a_n) \equiv 0 \ (m)$. If $(b_1, \ldots, b_n)$ is another $n$-tuple such that
$b_i \equiv a_i \ (m)$ for $i = 1, \ldots, n$, then it is easy to see that $f(b_1, \ldots, b_n) \equiv 0 \ (m)$. We
do not want to consider these two solutions as being essentially different. Thus
two solutions $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ are called equivalent if $a_i \equiv b_i \ (m)$ for
$i = 1, \ldots, n$. The number of solutions to $f(x_1, \ldots, x_n) \equiv 0 \ (m)$ is defined to be
the number of inequivalent solutions.

For example, 3, 8, and 13 are solutions to $6x \equiv 3 \ (15)$. 18 is also a solution,
but the solution $x = 18$ is equivalent to the solution $x = 3$.

It is useful to consider the matter from another point of view. The map
from $\mathbb{Z}$ to $\mathbb{Z}/m\mathbb{Z}$ given by $a \to \bar{a}$ is a homomorphism. If $f(a_1, \ldots, a_n) \equiv 0 \ (m)$,
then $\bar{f}(\bar{a}_1, \ldots, \bar{a}_n) = \bar{0}$. Here $\bar{f}(x_1, \ldots, x_n) \in \mathbb{Z}/m\mathbb{Z}[x_1, \ldots, x_n]$ is the polynomial
obtained from $f$ by putting a bar over each coefficient of $f$. One can
now see that equivalence classes of solutions to $f(x_1, \ldots, x_n) \equiv 0 \ (m)$ are in
one-to-one correspondence with solutions to $\bar{f}(x_1, \ldots, x_n) = \bar{0}$ in the ring
$\mathbb{Z}/m\mathbb{Z}$. This interpretation of the number of solutions arises frequently.

We now return to the number of solutions of the congruence $ax \equiv b \ (m)$.

Let $d > 0$ be the greatest common divisor of $a$ and $m$. Set $a' = a/d$ and
$m' = m/d$. Then $a'$ and $m'$ are relatively prime.

Proposition 3.3.1. The congruence $ax \equiv b \ (m)$ has solutions iff $d \mid b$. If $d \mid b$, then
there are exactly $d$ solutions. If $x_0$ is a solution, then the other solutions are
given by $x_0 + m', x_0 + 2m', \ldots, x_0 + (d - 1)m'$.

Proof. If $x_0$ is a solution, then $ax_0 - b = my_0$ for some integer $y_0$. Thus
$ax_0 - my_0 = b$. Since $d$ divides $ax_0 - my_0$, we must have $d \mid b$.

Conversely, suppose that $d \mid b$. By Lemma 4 on page 4 there exist integers
$x_0'$ and $y_0'$ such that $ax_0' - my_0' = d$. Let $c = b/d$ and multiply both sides of
the equation by $c$. Then $a(x_0' c) - m(y_0' c) = b$. Let $x_0 = x_0' c$. Then $ax_0 \equiv
b \ (m)$.

We have shown that $ax \equiv b \ (m)$ has a solution iff $d \mid b$.

Suppose that $x_0$ and $x_1$ are solutions. $ax_0 \equiv b \ (m)$ and $ax_1 \equiv b \ (m)$ imply
that $a(x_1 - x_0) \equiv 0 \ (m)$. Thus $m \mid a(x_1 - x_0)$ and $m' \mid a'(x_1 - x_0)$, which
implies that $m' \mid x_1 - x_0$ or $x_1 = x_0 + km'$ for some integer $k$. One easily
checks that any number of the form $x_0 + km'$ is a solution and that the solutions
$x_0, x_0 + m', \ldots, x_0 + (d - 1)m'$ are inequivalent. Let $x_1 = x_0 + km'$
be another solution. There are integers $r$ and $s$ such that $k = rd + s$ and
$0 \le s < d$. Thus $x_1 = x_0 + sm' + rm$ and $x_1$ is equivalent to $x_0 + sm'$.
This completes the proof. □

As an example, let us consider the congruence $6x \equiv 3 \ (15)$ once more. We
first solve $6x - 15y = 3$. Dividing by 3, we have $2x - 5y = 1$. $x = 3$, $y = 1$
is a solution. Thus $x_0 = 3$ is a solution to $6x \equiv 3 \ (15)$. Now, $m = 15$ and
$d = 3$ so that $m' = 5$. The three inequivalent solutions are 3, 8, and 13.
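
The proof of Proposition 3.3.1 is constructive, and the worked example can be reproduced mechanically. The sketch below is one possible rendering, not a canonical one: the function names are hypothetical, $m$ is assumed positive, and the extended Euclidean algorithm stands in for the appeal to Lemma 4.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

def solve_linear_congruence(a, b, m):
    """All inequivalent solutions of a*x = b modulo m (m > 0), in increasing order."""
    d, x_prime, _ = extended_gcd(a, m)    # a*x' + m*y' = d
    if b % d != 0:
        return []                         # solvable iff d divides b
    m_prime = m // d
    x0 = (x_prime * (b // d)) % m_prime   # one solution, reduced modulo m'
    return [x0 + k * m_prime for k in range(d)]

print(solve_linear_congruence(6, 3, 15))  # [3, 8, 13]
```

Reducing $x_0$ modulo $m'$ is a design choice that makes the returned list consist of the least nonnegative representatives.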

We have two important corollaries. 

Corollary 1. If $a$ and $m$ are relatively prime, then $ax \equiv b \ (m)$ has one and only
one solution.

Proof. In this case $d = 1$ so clearly $d \mid b$, and there are $d = 1$ solutions. □

Corollary 2. If $p$ is a prime and $a \not\equiv 0 \ (p)$, then $ax \equiv b \ (p)$ has one and only one
solution.

Proof. Immediate from Corollary 1. □

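Corollaries 1 and 2 can be interpreted in terms of the ring $\mathbb{Z}/m\mathbb{Z}$. The
congruence $ax \equiv b \ (m)$ is equivalent to the equation $\bar{a}x = \bar{b}$ over the ring
$\mathbb{Z}/m\mathbb{Z}$.

What are the units of $\mathbb{Z}/m\mathbb{Z}$? $\bar{a} \in \mathbb{Z}/m\mathbb{Z}$ is a unit iff $\bar{a}x = \bar{1}$ is solvable.
$ax \equiv 1 \ (m)$ is solvable iff $d \mid 1$, i.e., iff $a$ and $m$ are relatively prime. Thus $\bar{a}$ is a
unit iff $(a, m) = 1$, and it follows easily that there are exactly $\varphi(m)$ units in
$\mathbb{Z}/m\mathbb{Z}$ [see page 20 for the definition of $\varphi(m)$].

If $p$ is a prime and $\bar{a} \ne \bar{0}$ is in $\mathbb{Z}/p\mathbb{Z}$, then $(a, p) = 1$. Thus every nonzero
element of $\mathbb{Z}/p\mathbb{Z}$ is a unit, which shows that $\mathbb{Z}/p\mathbb{Z}$ is a field.

If $m$ is not a prime, then $m = m_1 m_2$, where $0 < m_1, m_2 < m$. Thus $\bar{m}_1 \ne \bar{0}$,
$\bar{m}_2 \ne \bar{0}$, but $\bar{m}_1 \bar{m}_2 = \overline{m_1 m_2} = \bar{m} = \bar{0}$. Therefore $\mathbb{Z}/m\mathbb{Z}$ is not a field.
Summarizing we have

Proposition 3.3.2. An element $\bar{a}$ of $\mathbb{Z}/m\mathbb{Z}$ is a unit iff $(a, m) = 1$. There are
exactly $\varphi(m)$ units in $\mathbb{Z}/m\mathbb{Z}$. $\mathbb{Z}/m\mathbb{Z}$ is a field iff $m$ is a prime.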

Corollary 1 (Euler's Theorem). If $(a, m) = 1$, then $a^{\varphi(m)} \equiv 1 \ (m)$.

Proof. The units in $\mathbb{Z}/m\mathbb{Z}$ form a group of order $\varphi(m)$. If $(a, m) = 1$, $\bar{a}$ is a
unit. Thus $\bar{a}^{\varphi(m)} = \bar{1}$ or $a^{\varphi(m)} \equiv 1 \ (m)$. □

Corollary 2 (Fermat's Little Theorem). If $p$ is a prime and $p \nmid a$, then $a^{p-1} \equiv
1 \ (p)$.

Proof. If $p \nmid a$, then $(a, p) = 1$. Thus $a^{\varphi(p)} \equiv 1 \ (p)$. The result follows, since
for a prime $p$, $\varphi(p) = p - 1$. □
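
Both corollaries can be spot-checked by direct computation. The sketch below uses our own helper names and counts units by brute force, which is only sensible for small moduli; Python's built-in three-argument `pow` performs the modular exponentiation.

```python
from math import gcd

def phi(m):
    """Euler's function: the number of integers in 1..m relatively prime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def units(m):
    """Least nonnegative representatives of the units of Z/mZ."""
    return [a for a in range(m) if gcd(a, m) == 1]

m = 15
print(phi(m), units(m))
print(all(pow(a, phi(m), m) == 1 for a in units(m)))    # Euler's theorem for m = 15
print(all(pow(a, 12, 13) == 1 for a in range(1, 13)))   # Fermat's theorem for p = 13
```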

It is possible to generalize many of the results in this section to principal 
ideal domains. 

The notions of congruence and residue class can be carried over to an
arbitrary commutative ring. The first part of Proposition 3.3.1 is valid in a
PID; i.e., $ax \equiv b \ (m)$ has a solution iff $d \mid b$ and the solution is unique iff $a$
and $m$ are relatively prime. The only difference is that the number of solutions
need not be finite. In any case, using this result one proves in analogy to part
of Proposition 3.3.2 that if $R$ is a PID and $m \in R$ is not zero or a unit, then
$R/(m)$ is a field iff $m$ is a prime.

In particular, if $k$ is a field, then $k[x]/(f(x))$ is a field iff $f(x)$ is irreducible.

§4 The Chinese Remainder Theorem

When the modulus m of a congruence is composite it is sometimes possible 
to reduce a congruence modulo m to a system of simpler congruences. The 
main theorem of this type is the so-called Chinese remainder theorem 
(Theorem 1), which we prove below. This theorem is valid for any PID (in 
fact, even more generally). However, we shall continue to work in Z and leave 
to the reader the relatively simple exercise of carrying over the proof for
PIDs.

Lemma 1. If $a_1, \ldots, a_t$ are all relatively prime to $m$, then so is $a_1 a_2 \cdots a_t$.

Proof. $\bar{a}_i \in \mathbb{Z}/m\mathbb{Z}$ is a unit. Thus so is $\bar{a}_1 \bar{a}_2 \cdots \bar{a}_t = \overline{a_1 a_2 \cdots a_t}$. By Proposition
3.3.2, $a_1 a_2 \cdots a_t$ is relatively prime to $m$. □

Another proof goes as follows. If $a_1 a_2 \cdots a_t$ was not prime to $m$, there
would be a prime $p$ that divides them both. $p \mid a_1 a_2 \cdots a_t$ implies that $p \mid a_i$ for
some $i$. It follows that $(a_i, m) \ne 1$, which contradicts the hypothesis.

Lemma 2. Suppose that $a_1, \ldots, a_t$ all divide $n$ and that $(a_i, a_j) = 1$ for $i \ne j$.
Then $a_1 a_2 \cdots a_t$ divides $n$.

Proof. The proof is by induction on $t$. If $t = 1$, there is nothing to do. Suppose
that $t > 1$ and that the lemma is true for $t - 1$. Then $a_1 a_2 \cdots a_{t-1}$
divides $n$. By Lemma 1, $a_t$ is prime to $a_1 a_2 \cdots a_{t-1}$. Thus there are integers $r$
and $s$ such that $r a_t + s a_1 a_2 \cdots a_{t-1} = 1$. Multiply both sides by $n$. Inspection
shows that the left-hand side is divisible by $a_1 a_2 \cdots a_t$ and the result follows. □

Theorem 1 (Chinese Remainder Theorem). Suppose that $m = m_1 m_2 \cdots m_t$
and that $(m_i, m_j) = 1$ for $i \ne j$. Let $b_1, b_2, \ldots, b_t$ be integers and consider the
system of congruences:

$$x \equiv b_1 \ (m_1), \quad x \equiv b_2 \ (m_2), \quad \ldots, \quad x \equiv b_t \ (m_t).$$

This system always has solutions and any two solutions differ by a multiple
of $m$.

Proof. Let $n_i = m/m_i$. By Lemma 1, $(m_i, n_i) = 1$. Thus there are integers $r_i$ and
$s_i$ such that $r_i m_i + s_i n_i = 1$. Let $e_i = s_i n_i$. Then $e_i \equiv 1 \ (m_i)$ and $e_i \equiv 0 \ (m_j)$
for $j \ne i$.

Set $x_0 = \sum_{i=1}^{t} b_i e_i$. Then we have $x_0 \equiv b_i e_i \ (m_i)$ and consequently
$x_0 \equiv b_i \ (m_i)$. $x_0$ is a solution.

Suppose that $x_1$ is another solution. Then $x_1 - x_0 \equiv 0 \ (m_i)$ for $i =
1, 2, \ldots, t$. In other words, $m_1, m_2, \ldots, m_t$ divide $x_1 - x_0$. By Lemma 2,
$m$ divides $x_1 - x_0$. □
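
The proof above translates directly into a procedure: compute each $e_i = s_i n_i$ and form $x_0 = \sum b_i e_i$. The sketch below follows those steps under a few assumptions of our own: the names `extended_gcd` and `crt` are hypothetical, the moduli are positive, and the recursive Euclidean algorithm is used to produce $r_i$ and $s_i$. The sample system is ours, chosen only for illustration.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), for nonnegative a, b."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def crt(residues, moduli):
    """Solve x = b_i modulo m_i for pairwise relatively prime moduli m_i > 0.

    Returns (x0, m) with 0 <= x0 < m; every solution is x0 plus a multiple of m.
    """
    m = 1
    for m_i in moduli:
        m *= m_i
    x0 = 0
    for b_i, m_i in zip(residues, moduli):
        n_i = m // m_i
        g, r_i, s_i = extended_gcd(m_i, n_i)   # r_i*m_i + s_i*n_i = 1
        if g != 1:
            raise ValueError("moduli must be pairwise relatively prime")
        e_i = s_i * n_i   # e_i = 1 modulo m_i and e_i = 0 modulo the other m_j
        x0 += b_i * e_i
    return x0 % m, m

print(crt([2, 3, 2], [3, 5, 7]))   # (23, 105)
```

Reducing $x_0$ modulo $m$ at the end picks the least nonnegative solution; the theorem itself only asserts uniqueness up to multiples of $m$.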
We wish to interpret Theorem 1 from a ring-theoretic point of view. If
$R_1, R_2, \ldots, R_n$ are rings, then $R_1 \oplus R_2 \oplus \cdots \oplus R_n = S$, the direct sum of
the $R_i$, is defined to be the set of $n$-tuples $(r_1, r_2, \ldots, r_n)$ with $r_i \in R_i$. Addition
and multiplication are defined by $(r_1, r_2, \ldots, r_n) + (r_1', r_2', \ldots, r_n') = (r_1 +
r_1', \ldots, r_n + r_n')$ and $(r_1, r_2, \ldots, r_n) \cdot (r_1', r_2', \ldots, r_n') = (r_1 r_1', r_2 r_2', \ldots, r_n r_n')$.
The zero element is $(0, 0, \ldots, 0)$ and the identity is $(1, 1, \ldots, 1)$. $u \in S$ is a unit
iff there is a $v \in S$ such that $uv = 1$. If $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$,
then $uv = 1$ implies that $u_i v_i = 1$ for $i = 1, \ldots, n$. Thus $u_i$ is a unit for each $i$.
Conversely, if $u_i$ is a unit for each $i$, then $u = (u_1, u_2, \ldots, u_n)$ is a unit. For a
ring $R$ we denote the group of units by $U(R)$. $U(R_1) \times U(R_2) \times \cdots \times U(R_n)$
is the set of $n$-tuples $(u_1, u_2, \ldots, u_n)$, where $u_i \in U(R_i)$. This is a group under
component-wise multiplication. We have shown

Proposition 3.4.1. If $S = R_1 \oplus R_2 \oplus \cdots \oplus R_n$, then $U(S) = U(R_1) \times U(R_2)
\times \cdots \times U(R_n)$.

Let $m_1, m_2, \ldots, m_t$ be pairwise relatively prime integers. $\psi_i$ will denote the
natural homomorphism from $\mathbb{Z}$ to $\mathbb{Z}/m_i\mathbb{Z}$. We construct a map $\psi$ from $\mathbb{Z}$ to
$\mathbb{Z}/m_1\mathbb{Z} \oplus \mathbb{Z}/m_2\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/m_t\mathbb{Z}$ as follows: $\psi(n) = (\psi_1(n), \psi_2(n), \ldots,
\psi_t(n))$ for all $n \in \mathbb{Z}$. It is easy to check that $\psi$ is a ring homomorphism. What
are the kernel and image of $\psi$?

$(\bar{b}_1, \bar{b}_2, \ldots, \bar{b}_t) = \psi(n)$ iff $\psi_i(n) = \bar{b}_i$ for $i = 1, \ldots, t$, i.e., $n \equiv b_i \ (m_i)$ for
$i = 1, \ldots, t$. The Chinese Remainder Theorem assures us that such an $n$
always exists. Thus $\psi$ is onto.

$\psi(n) = 0$ iff $n \equiv 0 \ (m_i)$, $i = 1, \ldots, t$, iff $n$ is divisible by $m = m_1 m_2 \cdots m_t$.
This is immediate from Lemma 2. Thus the kernel of $\psi$ is the ideal $m\mathbb{Z}$.

We have shown 

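Theorem 1'. The map $\psi$ induces an isomorphism between $\mathbb{Z}/m\mathbb{Z}$ and $\mathbb{Z}/m_1\mathbb{Z} \oplus
\mathbb{Z}/m_2\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/m_t\mathbb{Z}$.

Corollary. $U(\mathbb{Z}/m\mathbb{Z}) \approx U(\mathbb{Z}/m_1\mathbb{Z}) \times U(\mathbb{Z}/m_2\mathbb{Z}) \times \cdots \times U(\mathbb{Z}/m_t\mathbb{Z})$.

Proof. Immediate from Theorem 1' and Proposition 3.4.1. □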

Both sides of the isomorphism in the above corollary are finite groups.
The order of the left-hand side is $\varphi(m)$ and the order of the right-hand side is
$\varphi(m_1)\varphi(m_2) \cdots \varphi(m_t)$. Thus $\varphi(m) = \varphi(m_1)\varphi(m_2) \cdots \varphi(m_t)$.

Let $m = p_1^{a_1} p_2^{a_2} \cdots p_t^{a_t}$ be the prime decomposition of $m$. We have $\varphi(m) =
\varphi(p_1^{a_1})\varphi(p_2^{a_2}) \cdots \varphi(p_t^{a_t})$. For a prime power $p^a$, $\varphi(p^a) = p^a - p^{a-1}$, because
the numbers less than $p^a$ and prime to $p^a$ are prime to $p$. Since $p^a/p = p^{a-1}$
numbers less than $p^a$ are divisible by $p$, $p^a - p^{a-1}$ numbers are prime to $p$.
Notice that $p^a - p^{a-1} = p^a(1 - 1/p)$. It follows that $\varphi(m) = m \prod_{p \mid m} (1 - 1/p)$.
We proved this formula in Chapter 2 in a different manner.
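
As a check on the formula, the sketch below computes $\varphi(m)$ in two ways: by counting, and from the product over primes dividing $m$. The function names are our own, and the factorization uses simple trial division, which is an assumption suitable only for small $m$.

```python
from math import gcd

def phi_by_counting(m):
    """Number of integers in 1..m relatively prime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def phi_by_formula(m):
    """phi(m) = m * product over primes p dividing m of (1 - 1/p), in integers."""
    result, n, p = m, m, 2
    while p * p <= n:
        if n % p == 0:
            result -= result // p   # multiply result by (1 - 1/p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:                       # at most one prime factor can remain
        result -= result // n
    return result

print([phi_by_formula(m) for m in range(1, 13)])
print(phi_by_formula(36) == phi_by_formula(4) * phi_by_formula(9))
```

The second line illustrates the multiplicativity $\varphi(m_1 m_2) = \varphi(m_1)\varphi(m_2)$ for relatively prime $m_1$, $m_2$ obtained from the corollary.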
Let us summarize. In treating a number of arithmetical questions, the
notion of congruence is extremely useful. This notion led us to consider the
ring $\mathbb{Z}/m\mathbb{Z}$ and its group of units $U(\mathbb{Z}/m\mathbb{Z})$. To go more deeply into the structure
of these algebraic objects we write $m = p_1^{a_1} p_2^{a_2} \cdots p_t^{a_t}$ and are led, via the
Chinese Remainder Theorem, to the following isomorphisms:

$$\mathbb{Z}/m\mathbb{Z} \approx \mathbb{Z}/p_1^{a_1}\mathbb{Z} \oplus \mathbb{Z}/p_2^{a_2}\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/p_t^{a_t}\mathbb{Z},$$

$$U(\mathbb{Z}/m\mathbb{Z}) \approx U(\mathbb{Z}/p_1^{a_1}\mathbb{Z}) \times U(\mathbb{Z}/p_2^{a_2}\mathbb{Z}) \times \cdots \times U(\mathbb{Z}/p_t^{a_t}\mathbb{Z}).$$

For prime powers it is possible to push the investigation much further.
This is the subject of Chapter 4.


Notes 

It would be useful for the reader to consult other treatments of the basic 
material given here. See, for example, the very readable book of Davenport 
[22] and (again) Hardy and Wright [40]. See also Niven and Zuckerman 
[61], T. Nagell [60], E. Landau [52] and Vinogradov [77]. 

An interesting discussion of the various possible ways of arranging this
material can be found in P. Samuel, "Sur l'organisation d'un cours
d'arithmétique," L'Enseignement Math., 13, (1967), 223-231. A more advanced
discussion of congruences is given in the first chapter of Borevich and 
Shafarevich [9]; this book also shows how the theory of congruences is 
useful in determining whether equations can be solved in integers. We 
mention also the beautiful treatment by J. P. Serre [69]. 

Historically the notion of congruences was first introduced and used
systematically in Gauss' Disquisitiones Arithmeticae. The notion of congruence
is a wonderful example of the usefulness of employing the "right"
notation.

As far as the Chinese Remainder Theorem is concerned we note that 
Hardy and Wright [40] note that R. Bachman [4] notes that Sun Tsu was 
aware of this result in the first century a.d. The theorem is capable of vast 
generalizations. Properly formulated it holds in any ring with identity. 
Surprisingly it is no more difficult to prove in general than in the special 
case we have given (see Proposition 12.3.1). 


Exercises 

1. Show that there are infinitely many primes congruent to $-1$ modulo 6.

2. Construct addition and multiplication tables for $\mathbb{Z}/5\mathbb{Z}$, $\mathbb{Z}/8\mathbb{Z}$, and $\mathbb{Z}/10\mathbb{Z}$.

3. Let $abc$ be the decimal representation for an integer between 1 and 1000. Show that
$abc$ is divisible by 3 iff $a + b + c$ is divisible by 3. Show that the same result is true if
we replace 3 by 9. Show that $abc$ is divisible by 11 iff $a - b + c$ is divisible by 11.
Generalize to any number written in decimal notation.

4. Show that the equation $3x^2 + 2 = y^2$ has no solution in integers.

5. Show that the equation $7x^3 + 2 = y^3$ has no solution in integers.

6. Let an integer $n > 0$ be given. A set of integers $a_1, a_2, \ldots, a_{\varphi(n)}$ is called a reduced
residue system modulo $n$ if they are pairwise incongruent modulo $n$ and $(a_i, n) = 1$
for all $i$. If $(a, n) = 1$, prove that $aa_1, aa_2, \ldots, aa_{\varphi(n)}$ is again a reduced residue system
modulo $n$.

7. Use Exercise 6 to give another proof of Euler's theorem, $a^{\varphi(n)} \equiv 1 \ (n)$ for $(a, n) = 1$.

8. Let $p$ be an odd prime. If $k \in \{1, 2, \ldots, p - 1\}$, show that there is a unique $b_k$ in this
set such that $k b_k \equiv 1 \ (p)$. Show that $k \ne b_k$ unless $k = 1$ or $k = p - 1$.

9. Use Exercise 8 to prove that $(p - 1)! \equiv -1 \ (p)$. This is known as Wilson's theorem.

10. If $n$ is not a prime, show that $(n - 1)! \equiv 0 \ (n)$, except when $n = 4$.

11. Let $a_1, a_2, \ldots, a_{\varphi(n)}$ be a reduced residue system modulo $n$ and let $N$ be the number of
solutions to $x^2 \equiv 1 \ (n)$. Prove that $a_1 a_2 \cdots a_{\varphi(n)} \equiv (-1)^{N/2} \ (n)$.


12. Let $\binom{p}{k} = p!/(k!(p - k)!)$ be a binomial coefficient, and suppose that $p$ is a prime.
If $1 \le k \le p - 1$, show that $p$ divides $\binom{p}{k}$. Deduce $(a + 1)^p \equiv a^p + 1 \ (p)$.

13. Use Exercise 12 to give another proof of Fermat's theorem, $a^{p-1} \equiv 1 \ (p)$ if $p \nmid a$.

14. Let $p$ and $q$ be distinct odd primes such that $p - 1$ divides $q - 1$. If $(n, pq) = 1$,
show that $n^{q-1} \equiv 1 \ (pq)$.

15. For any prime $p$ show that the numerator of $1 + \frac{1}{2} + \frac{1}{3} + \cdots + 1/(p - 1)$ is divisible
by $p$. (Hint: Make use of Exercises 8 and 9.)

16. Use the proof of the Chinese Remainder Theorem to solve the system $x \equiv 1 \ (7)$,
$x \equiv 4 \ (9)$, $x \equiv 3 \ (5)$.

17. Let $f(x) \in \mathbb{Z}[x]$ and $n = p_1^{a_1} p_2^{a_2} \cdots p_t^{a_t}$. Show that $f(x) \equiv 0 \ (n)$ has a solution iff
$f(x) \equiv 0 \ (p_i^{a_i})$ has a solution for $i = 1, 2, \ldots, t$.

18. Let $N$ be the number of solutions to $f(x) \equiv 0 \ (n)$ and $N_i$ be the number of solutions
to $f(x) \equiv 0 \ (p_i^{a_i})$. Prove that $N = N_1 N_2 \cdots N_t$.

19. If $p$ is an odd prime, show that 1 and $-1$ are the only solutions to $x^2 \equiv 1 \ (p^a)$.

20. Show that $x^2 \equiv 1 \ (2^b)$ has one solution if $b = 1$, two solutions if $b = 2$, and four
solutions if $b \ge 3$.

21. Use Exercises 18-20 to find the number of solutions to x 2 = 1 (n). 

22. Formulate and prove the Chinese Remainder Theorem in a principal ideal domain. 

23. Extend the notion of congruence to the ring Z[i] and prove that a + bi is always 
congruent to 0 or 1 modulo 1 + i. 

24. Extend the notion of congruence to the ring Z[o>] and prove that a + bco is always 
congruent to either — 1, 1, or 0 modulo 1 — 
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3 Congruence 


25. Let A = 1 — oj e Z[co]. If a g Z[co] and a = 1 (A), prove that a 3 = 1 (9). (Hint : Show 
first that 3 = —co 2 X 2 .) 

26. Use Exercise 25 to show that if Z\_co] are not zero and f 3 + 口 3 + C 3 = 0, then 

A divides at least one of the elements ^ rj, C. 



Chapter 4 

The Structure of U(Z/nZ) 


Having introduced the notion of congruence and discussed 
some of its properties and applications we shall now go 
more deeply into the subject. The key result is the existence 
of primitive roots modulo a prime. This theorem was used 
by mathematicians before Gauss but he was the first to 
give a proof. In the terminology introduced in Chapter 3 
the existence of primitive roots is equivalent to the fact 
that U(Z/pZ) is a cyclic group when p is a prime. Using
this fact we shall find an explicit description of the group
U(Z/nZ) for arbitrary n.


§1 Primitive Roots and the Group Structure 
of U(Z/nZ) 

If $n = p_1^{a_1} p_2^{a_2} \cdots p_l^{a_l}$, then, as was shown in Chapter 3,
$U(Z/nZ) \cong U(Z/p_1^{a_1}Z) \times \cdots \times U(Z/p_l^{a_l}Z)$. Thus to determine the structure of U(Z/nZ) it is sufficient
to consider the case $U(Z/p^aZ)$, where $p$ is a prime. We begin by considering
the simplest case, U(Z/pZ).

Since Z/pZ is a field, it will be helpful to have available the following 
simple lemma about fields. 

Lemma 1. Let $f(x) \in k[x]$, $k$ a field. Suppose that $\deg f(x) = n$. Then $f$ has at
most $n$ distinct roots.

Proof. The proof goes by induction on $n$. For $n = 1$ the assertion is trivial.
Assume that the lemma is true for polynomials of degree $n - 1$. If $f(x)$
has no roots in $k$, we are done. If $\alpha$ is a root, $f(x) = q(x)(x - \alpha) + r$, where $r$
is a constant. Setting $x = \alpha$ we see that $r = 0$. Thus $f(x) = q(x)(x - \alpha)$
and $\deg q(x) = n - 1$. If $\beta \ne \alpha$ is another root of $f(x)$, then $0 = f(\beta) =
(\beta - \alpha)q(\beta)$, which implies that $q(\beta) = 0$. Since by induction $q(x)$ has at
most $n - 1$ distinct roots, $f(x)$ has at most $n$ distinct roots. □

Corollary. Let $f(x), g(x) \in k[x]$ and $\deg f(x) = \deg g(x) = n$. If $f(\alpha_i) =
g(\alpha_i)$ for $n + 1$ distinct elements $\alpha_1, \alpha_2, \ldots, \alpha_n, \alpha_{n+1}$, then $f(x) = g(x)$.

Proof. Apply the lemma to the polynomial $f(x) - g(x)$. □

Proposition 4.1.1. $x^{p-1} - 1 \equiv (x - 1)(x - 2) \cdots (x - p + 1) \ (p)$.

Proof. If $\bar{a}$ denotes the residue class of an integer $a$ in $Z/pZ$, an equivalent
way of stating the proposition is $x^{p-1} - \bar{1} = (x - \bar{1})(x - \bar{2}) \cdots (x - \overline{p-1})$
in $Z/pZ[x]$. Let $f(x) = (x^{p-1} - \bar{1}) - (x - \bar{1})(x - \bar{2}) \cdots (x - \overline{p-1})$. $f(x)$
has degree less than $p - 1$ (the leading terms cancel) and has the $p - 1$ roots
$\bar{1}, \bar{2}, \ldots, \overline{p-1}$ (Fermat’s Little Theorem). Thus $f(x)$ is identically zero. □

Corollary. $(p - 1)! \equiv -1 \ (p)$.

Proof. Set $x = 0$ in Proposition 4.1.1. □

This result is known as Wilson’s theorem. It is not hard to prove that if
$n > 4$ is not prime, then $(n - 1)! \equiv 0 \ (n)$ (see Exercise 10 of Chapter 3).
Thus the congruence $(n - 1)! \equiv -1 \ (n)$ is characteristic for primes. We shall
make use of Wilson’s theorem later when discussing quadratic residues.
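
The characterization of primes by Wilson’s congruence is easy to check numerically for small n. The following Python sketch is an illustration only and not part of the original text; the function name `satisfies_wilson` is our own. It is hopelessly slow as a primality test, since it computes a full factorial, but it makes the statement concrete.

```python
from math import factorial


def satisfies_wilson(n: int) -> bool:
    """Return True when (n - 1)! is congruent to -1 modulo n (n >= 2)."""
    return factorial(n - 1) % n == n - 1


# The integers from 2 to 29 that satisfy Wilson's congruence.
wilson_numbers = [n for n in range(2, 30) if satisfies_wilson(n)]
print(wilson_numbers)

# For composite n > 4 the factorial is divisible by n; n = 4 is the exception.
print(factorial(3) % 4, factorial(5) % 6, factorial(8) % 9)
```

The first list should consist exactly of the primes below 30, in agreement with the corollary and with Exercise 10 of Chapter 3.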

Proposition 4.1.2. If $d \mid p - 1$, then $x^d \equiv 1 \ (p)$ has exactly $d$ solutions.

Proof. Let $dd' = p - 1$. Then

$$\frac{x^{p-1} - 1}{x^d - 1} = \frac{(x^d)^{d'} - 1}{x^d - 1} = (x^d)^{d'-1} + (x^d)^{d'-2} + \cdots + x^d + 1 = g(x).$$

Therefore

$$x^{p-1} - 1 = (x^d - 1)g(x)$$

and

$$x^{p-1} - \bar{1} = (x^d - \bar{1})\bar{g}(x).$$

If $x^d - \bar{1}$ had less than $d$ roots, then by Lemma 1 the right-hand side would
have less than $p - 1$ roots. However, the left-hand side has the $p - 1$ roots
$\bar{1}, \bar{2}, \ldots, \overline{p-1}$. Thus $x^d \equiv 1 \ (p)$ has exactly $d$ roots as asserted. □

Theorem 1. U(Z/pZ) is a cyclic group.

Proof. For $d \mid p - 1$ let $\psi(d)$ be the number of elements in U(Z/pZ) of order
$d$. By Proposition 4.1.2 we see that the elements of U(Z/pZ) satisfying
$x^d = \bar{1}$ form a group of order $d$. Thus $\sum_{c \mid d} \psi(c) = d$. Applying the Möbius
inversion theorem we obtain $\psi(d) = \sum_{c \mid d} \mu(c)d/c$. The right-hand side of this
equation is equal to $\phi(d)$, as was seen in the proof of Proposition 2.2.5.
In particular, $\psi(p - 1) = \phi(p - 1)$, which is at least 1 if $p > 2$. Since
the case $p = 2$ is trivial, we have shown in all cases the existence of an element
[in fact, $\phi(p - 1)$ elements] of order $p - 1$. □
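
The equality $\psi(d) = \phi(d)$ can be observed directly for small primes by tabulating the orders of all elements of U(Z/pZ). The Python sketch below is our own illustration (the names `multiplicative_order`, `phi`, and `order_counts` are not from the text) and uses brute force throughout.

```python
from math import gcd


def multiplicative_order(a: int, n: int) -> int:
    """Smallest e >= 1 with a**e congruent to 1 mod n; assumes gcd(a, n) == 1 and n >= 2."""
    e, x = 1, a % n
    while x != 1:
        x = x * a % n
        e += 1
    return e


def phi(n: int) -> int:
    """Euler's function, computed directly from its definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def order_counts(p: int) -> dict:
    """Map each order d to psi(d), the number of elements of U(Z/pZ) of order d."""
    counts = {}
    for a in range(1, p):
        d = multiplicative_order(a, p)
        counts[d] = counts.get(d, 0) + 1
    return counts


p = 13
for d, psi_d in sorted(order_counts(p).items()):
    print(d, psi_d, phi(d))   # psi(d) and phi(d) should agree on every line
```

Only divisors of $p - 1$ appear as orders, and the count for $d = p - 1$ is the number of generators.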

Theorem 1 is of fundamental importance. It was first proved by Gauss.
After giving some new terminology we shall outline two more proofs.

Definition. An integer $a$ is called a primitive root mod $p$ if $a$ generates the
group U(Z/pZ). Equivalently, $a$ is a primitive root mod $p$ if $p - 1$ is the
smallest positive integer such that $a^{p-1} \equiv 1 \ (p)$.

As an example, 2 is a primitive root mod 5, since the least positive residues
of $2, 2^2, 2^3$, and $2^4$ are 2, 4, 3, and 1. Thus $4 = 5 - 1$ is the smallest positive
integer such that $2^n \equiv 1 \ (5)$.

For $p = 7$, 2 is not a primitive root since $2^3 \equiv 1 \ (7)$, but 3 is since $3, 3^2,
3^3, 3^4, 3^5$, and $3^6$ are congruent to 3, 2, 6, 4, 5, and 1 mod 7.

Although Theorem 1 shows the existence of primitive roots for a given 
prime, there is no simple way of finding one. For small primes trial and error 
is probably as good a method as any. 
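
Trial and error is simple to mechanize. The sketch below is our own illustration, not a method given in the text: it tests candidates $a = 1, 2, 3, \ldots$ using the criterion of Exercise 8 at the end of this chapter, namely that $a$ is a primitive root iff $a^{(p-1)/q} \not\equiv 1 \ (p)$ for every prime divisor $q$ of $p - 1$. The function names are hypothetical, and $p$ is assumed to be prime.

```python
def prime_factors(n: int) -> set:
    """Set of distinct prime divisors of n, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_root(a: int, p: int) -> bool:
    """Test whether a generates U(Z/pZ); p is assumed to be prime."""
    if a % p == 0:
        return False
    return all(pow(a, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def smallest_primitive_root(p: int) -> int:
    """Try a = 1, 2, 3, ... until a primitive root mod p is found."""
    for a in range(1, p):
        if is_primitive_root(a, p):
            return a
    raise ValueError("p is not prime")


print(smallest_primitive_root(5), smallest_primitive_root(7))  # 2 3, as in the examples above
```

The built-in three-argument `pow` performs modular exponentiation, so each test costs only a few multiplications per prime divisor of $p - 1$.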

A celebrated conjecture of E. Artin states that if $a > 1$ is not a square, then
there are infinitely many primes for which $a$ is a primitive root. Some progress
has been made in recent years, but the conjecture still seems far from
resolution. See [35].

Because of its importance, we outline two more proofs of Theorem 1. The 
reader is invited to fill in the details. 

Let $p - 1 = q_1^{e_1} q_2^{e_2} \cdots q_t^{e_t}$ be the prime decomposition of $p - 1$. Consider
the congruences

(1) $x^{q_i^{e_i - 1}} \equiv 1 \ (p)$.

(2) $x^{q_i^{e_i}} \equiv 1 \ (p)$.

Every solution to congruence 1 is a solution of congruence 2. Moreover,
congruence 2 has more solutions than congruence 1. Let $g_i$ be a solution to
congruence 2 that is not a solution to congruence 1 and set $g = g_1 g_2 \cdots g_t$.
$g_i$ generates a subgroup of U(Z/pZ) of order $q_i^{e_i}$. It follows that $g$ generates a
subgroup of U(Z/pZ) of order $q_1^{e_1} q_2^{e_2} \cdots q_t^{e_t} = p - 1$. Thus $g$ is a primitive
root and U(Z/pZ) is cyclic.
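
This second proof is constructive, and it can be turned into a short program. In the Python sketch below (our own illustration, with hypothetical function names) each $g_i$ is sought among the elements $x^{(p-1)/q_i^{e_i}}$, which automatically satisfy congruence 2; this particular way of searching is an assumption of the sketch, since the text does not say how a suitable $g_i$ should be found.

```python
def factorize(n: int) -> dict:
    """Prime decomposition of n as a dict {prime: exponent}, by trial division."""
    out, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            out[d] = out.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out


def multiplicative_order(a: int, n: int) -> int:
    """Smallest e >= 1 with a**e congruent to 1 mod n; assumes gcd(a, n) == 1 and n >= 2."""
    e, x = 1, a % n
    while x != 1:
        x = x * a % n
        e += 1
    return e


def primitive_root_from_prime_powers(p: int) -> int:
    """Build g = g_1 * ... * g_t, where g_i has order q_i**e_i; p is assumed prime."""
    g = 1
    for q, e in factorize(p - 1).items():
        for x in range(2, p):
            g_i = pow(x, (p - 1) // q**e, p)      # a solution of congruence (2)
            if pow(g_i, q**(e - 1), p) != 1:      # ... that does not solve congruence (1)
                break
        g = g * g_i % p
    return g


g = primitive_root_from_prime_powers(29)
print(g, multiplicative_order(g, 29))  # the order should be 28
```

The result need not be the smallest primitive root; it is whatever product the search happens to produce.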

Finally, on group-theoretic grounds we can see that $\psi(d) \le \phi(d)$ for
$d \mid p - 1$. Both $\sum_{d \mid p-1} \psi(d)$ and $\sum_{d \mid p-1} \phi(d)$ are equal to $p - 1$. It follows that
$\psi(d) = \phi(d)$ for all $d \mid p - 1$. In particular, $\psi(p - 1) = \phi(p - 1)$. For $p > 2$,
$\phi(p - 1) \ge 1$, implying that $\psi(p - 1) \ge 1$. The result follows.

The notion of primitive root can be generalized somewhat. 

Definition. Let $a, n \in Z$. $a$ is said to be a primitive root mod $n$ if the residue class
of $a$ mod $n$ generates U(Z/nZ). It is equivalent to require that $a$ and $n$ be
relatively prime and that $\phi(n)$ be the smallest positive integer such that
$a^{\phi(n)} \equiv 1 \ (n)$.

In general, it is not true that U(Z/nZ) is cyclic. For example, the elements
of U(Z/8Z) are $\bar{1}, \bar{3}, \bar{5}, \bar{7}$, and $\bar{1}^2 = \bar{1}$, $\bar{3}^2 = \bar{1}$, $\bar{5}^2 = \bar{1}$, $\bar{7}^2 = \bar{1}$. Thus there is
no element of order $4 = \phi(8)$. It follows that not every integer possesses
primitive roots. We shall shortly determine those integers that do.

Lemma 2. If $p$ is a prime and $1 \le k < p$, then the binomial coefficient $\binom{p}{k}$ is
divisible by $p$.

Proof. We give two proofs.

(a) By definition

$$\binom{p}{k} = \frac{p!}{k!\,(p - k)!}$$

so that $p! = k!\,(p - k)!\binom{p}{k}$. Now, $p$ divides $p!$, but $p$ does not divide $k!\,(p - k)!$ since this expression
is a product of integers less than, and thus relatively prime to $p$. Thus $p$
divides $\binom{p}{k}$.

(b) By Fermat’s Little Theorem $a^{p-1} \equiv 1 \ (p)$ if $p \nmid a$. It follows that $a^p \equiv
a \ (p)$ for all $a$. In particular, $(1 + a)^p \equiv 1 + a \equiv 1 + a^p \ (p)$ for all $a$.
Thus $(1 + x)^p - 1 - x^p \equiv 0 \ (p)$ has $p$ solutions. Since the polynomial
has degree less than $p$ it follows from the corollary to Lemma 1 that
$(\bar{1} + x)^p - \bar{1} - x^p$ is identically zero in $Z/pZ[x]$.

$$(1 + x)^p - 1 - x^p = \sum_{k=1}^{p-1} \binom{p}{k} x^k.$$

Thus $\overline{\binom{p}{k}} = \bar{0}$ for $1 \le k \le p - 1$, implying that $p \mid \binom{p}{k}$. The only interest
in this proof is that we do not assume any information on $\binom{p}{k}$. □

Lemma 3. If $l \ge 1$ and $a \equiv b \ (p^l)$, then $a^p \equiv b^p \ (p^{l+1})$.

Proof. We may write $a = b + cp^l$, $c \in Z$. Thus $a^p = b^p + \binom{p}{1} b^{p-1} cp^l + A$,
where $A$ is an integer divisible by $p^{l+2}$. The second term is clearly divisible
by $p^{l+1}$. Thus $a^p \equiv b^p \ (p^{l+1})$. □

Corollary 1. If $l \ge 2$ and $p \ne 2$, then $(1 + ap)^{p^{l-2}} \equiv 1 + ap^{l-1} \ (p^l)$ for all
$a \in Z$.

Proof. The proof is by induction on $l$. For $l = 2$ the assertion is trivial.
Suppose that it is true for some $l \ge 2$. We show that it is then true for $l + 1$.
Applying Lemma 3 we obtain

$$(1 + ap)^{p^{l-1}} \equiv (1 + ap^{l-1})^p \ (p^{l+1}).$$

By the binomial theorem

$$(1 + ap^{l-1})^p = 1 + \binom{p}{1} ap^{l-1} + B,$$

where $B$ is a sum of $p - 1$ terms. Using Lemma 2 it is easy to see that all these
terms are divisible by $p^{1 + 2(l-1)}$ except perhaps for the last term, $a^p p^{p(l-1)}$.
Since $l \ge 2$, $1 + 2(l - 1) \ge l + 1$, and since also $p \ge 3$, $p(l - 1) \ge l + 1$.
Thus $p^{l+1} \mid B$ and $(1 + ap)^{p^{l-1}} \equiv 1 + ap^l \ (p^{l+1})$, which is as required. □

Before stating a second corollary we need a definition.

Definition. Let $a, n \in Z$ and $(a, n) = 1$. We say $a$ has order $e$ mod $n$ if $e$ is the
smallest positive integer such that $a^e \equiv 1 \ (n)$. This is equivalent to saying
that $a$ has order $e$ in the group U(Z/nZ).

Corollary 2. If $p \ne 2$ and $p \nmid a$, then $p^{l-1}$ is the order of $1 + ap$ mod $p^l$.

Proof. By Corollary 1, $(1 + ap)^{p^{l-1}} \equiv 1 + ap^l \ (p^{l+1})$, implying that
$(1 + ap)^{p^{l-1}} \equiv 1 \ (p^l)$ and thus that $1 + ap$ has order dividing $p^{l-1}$.
$(1 + ap)^{p^{l-2}} \equiv 1 + ap^{l-1} \ (p^l)$ shows that $p^{l-2}$ is not the order of $1 + ap$ (it is here we use the
hypothesis $p \nmid a$). The result follows. □
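
Corollary 2 is easy to test by machine. The following Python sketch is our own illustration with hypothetical function names; it computes the order of $1 + ap$ mod $p^l$ by brute force. The last line shows that the restriction to odd primes is needed: for $p = 2$, $a = 1$, $l = 3$ the order of 3 mod 8 is 2 rather than $2^{l-1} = 4$.

```python
def multiplicative_order(a: int, n: int) -> int:
    """Smallest e >= 1 with a**e congruent to 1 mod n; assumes gcd(a, n) == 1 and n >= 2."""
    e, x = 1, a % n
    while x != 1:
        x = x * a % n
        e += 1
    return e


def order_of_one_plus_ap(a: int, p: int, l: int) -> int:
    """Order of 1 + a*p in U(Z/p^l Z)."""
    return multiplicative_order(1 + a * p, p**l)


# For an odd prime p and p not dividing a, each pair should have equal entries.
for p, a, l in [(3, 1, 4), (5, 2, 3), (7, 3, 2)]:
    print(order_of_one_plus_ap(a, p, l), p**(l - 1))

print(order_of_one_plus_ap(1, 2, 3))  # 2, not 4: the corollary fails for p = 2
```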

We are now in a position to extend Theorem 1. It turns out that we shall 
have to treat the prime 2 separately from the odd primes. The necessity of 
treating 2 differently from the other primes occurs repeatedly in number 
theory. 

Theorem 2. If $p$ is an odd prime and $l \in Z^+$, then $U(Z/p^lZ)$ is cyclic; i.e., there
exist primitive roots mod $p^l$.

Proof. By Theorem 1 there exist primitive roots mod $p$. If $g \in Z$ is a primitive
root mod $p$, then so is $g + p$. If $g^{p-1} \equiv 1 \ (p^2)$, then $(g + p)^{p-1} \equiv g^{p-1} +
(p - 1)g^{p-2}p \equiv 1 + (p - 1)g^{p-2}p \ (p^2)$. Since $p^2$ does not divide
$(p - 1)g^{p-2}p$ we may assume from the beginning that $g$ is a primitive root mod $p$
and that $g^{p-1} \not\equiv 1 \ (p^2)$.

We claim that such a $g$ is already a primitive root mod $p^l$. To prove this it
is sufficient to prove that if $g^n \equiv 1 \ (p^l)$, then $\phi(p^l) = p^{l-1}(p - 1) \mid n$.

$g^{p-1} = 1 + ap$, where $p \nmid a$. By Corollary 2 to Lemma 3, $p^{l-1}$ is the order
of $1 + ap$ mod $p^l$. Since $(1 + ap)^n \equiv 1 \ (p^l)$ we have $p^{l-1} \mid n$.

Let $n = p^{l-1}n'$. Then $g^n = (g^{p^{l-1}})^{n'} \equiv g^{n'} \ (p)$, and therefore $g^{n'} \equiv 1 \ (p)$.
Since $g$ is a primitive root mod $p$, $p - 1 \mid n'$. We have proved that
$p^{l-1}(p - 1) \mid n$, as required. □
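
The proof gives a recipe: take a primitive root $g$ mod $p$, replace it by $g + p$ if $g^{p-1} \equiv 1 \ (p^2)$, and the result is a primitive root mod every power of $p$. A minimal Python sketch of this recipe follows; it is our own illustration and the function names are hypothetical. The example $p = 29$, $g = 14$ is a case in which the replacement is actually needed, since $14^{28} \equiv 1 \ (29^2)$.

```python
def multiplicative_order(a: int, n: int) -> int:
    """Smallest e >= 1 with a**e congruent to 1 mod n; assumes gcd(a, n) == 1 and n >= 2."""
    e, x = 1, a % n
    while x != 1:
        x = x * a % n
        e += 1
    return e


def lift_primitive_root(g: int, p: int) -> int:
    """Given a primitive root g mod an odd prime p, return g or g + p,
    whichever has (p - 1)th power not congruent to 1 mod p**2."""
    return g if pow(g, p - 1, p * p) != 1 else g + p


h = lift_primitive_root(14, 29)
print(h)                                             # 43
print(multiplicative_order(h, 29**2), 29 * 28)       # both should equal phi(29^2)
```

By Theorem 2 the same $h$ then has order $\phi(29^l) = 29^{l-1} \cdot 28$ mod $29^l$ for every $l$.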

Theorem 2′. $2^l$ has primitive roots for $l = 1$ or 2 but not for $l \ge 3$. If $l \ge 3$, then
$\{(-1)^a 5^b \mid a = 0, 1 \text{ and } 0 \le b < 2^{l-2}\}$ constitutes a reduced residue system
mod $2^l$. It follows that for $l \ge 3$, $U(Z/2^lZ)$ is the direct product of two cyclic
groups, one of order 2, the other of order $2^{l-2}$.

Proof. 1 is a primitive root mod 2, and 3 is a primitive root mod 4. From now
on let us assume that $l \ge 3$.

We claim that (1) $5^{2^{l-3}} \equiv 1 + 2^{l-1} \ (2^l)$. This is true for $l = 3$. Assume that
it is true for $l \ge 3$ and we shall prove it is true for $l + 1$. First notice that
$(1 + 2^{l-1})^2 = 1 + 2^l + 2^{2l-2}$ and that $2l - 2 \ge l + 1$ for $l \ge 3$. Applying
Lemma 3 to congruence (1), we get (2) $5^{2^{l-2}} \equiv 1 + 2^l \ (2^{l+1})$. Our claim is
now established by induction.

From (2) we see that $5^{2^{l-2}} \equiv 1 \ (2^l)$, whereas from (1) we see that $5^{2^{l-3}} \not\equiv
1 \ (2^l)$. Thus $2^{l-2}$ is the order of 5 mod $2^l$.

Consider the set $\{(-1)^a 5^b \mid a = 1, 2 \text{ and } 0 \le b < 2^{l-2}\}$. We claim that
these $2^{l-1}$ numbers are incongruent mod $2^l$. Since $\phi(2^l) = 2^{l-1}$ this will
show that our set is in fact a reduced residue system mod $2^l$.

If $(-1)^a 5^b \equiv (-1)^{a'} 5^{b'} \ (2^l)$, $l \ge 3$, then $(-1)^a \equiv (-1)^{a'} \ (4)$, implying that
$a \equiv a' \ (2)$. Thus $a = a'$. Going further, $a = a'$ implies that $5^b \equiv 5^{b'} \ (2^l)$ or that
$5^{b-b'} \equiv 1 \ (2^l)$. Therefore, $b \equiv b' \ (2^{l-2})$, which yields $b = b'$.

Finally, notice that $(-1)^a 5^b$ raised to the $2^{l-2}$ power is congruent to 1
mod $2^l$. Thus $2^l$ has no primitive roots if $l \ge 3$. □

Consider the situation mod 8. 1, 3, 5, and 7 constitute a reduced residue
system. We have $5^0 \equiv 1$, $5^1 \equiv 5$, $-5^0 \equiv 7$, and $-5^1 \equiv 3$. Table 1 represents
the situation mod 16. The second row contains the least positive residues of
the powers of 5, and the third row those of the negatives of the powers of 5.


Table 1

         5^0    5^1    5^2    5^3
   +      1      5      9     13
   -     15     11      7      3
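
The rows of Table 1, and the corresponding lists for larger powers of 2, can be regenerated with a few lines of Python. The sketch below is our own illustration (the function name is hypothetical); it lists the residues $(-1)^a 5^b$ mod $2^l$ and lets one check that they are exactly the odd residues.

```python
def signed_powers_of_five(l: int) -> list:
    """Sorted least positive residues of (-1)**a * 5**b mod 2**l,
    for a = 0, 1 and 0 <= b < 2**(l - 2); intended for l >= 3."""
    n = 2**l
    return sorted(((-1)**a * pow(5, b, n)) % n
                  for a in (0, 1)
                  for b in range(2**(l - 2)))


print(signed_powers_of_five(4))  # [1, 3, 5, 7, 9, 11, 13, 15]

# Every element has order dividing 2**(l - 2), so none generates the whole group.
l = 5
print(all(pow(x, 2**(l - 2), 2**l) == 1 for x in signed_powers_of_five(l)))  # True
```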


Theorems 2 and 2' permit us to give a fairly complete description of the 
group U(Z/nZ) for arbitrary n. 

Theorem 3. Let $n = 2^a p_1^{a_1} p_2^{a_2} \cdots p_l^{a_l}$ be the prime decomposition of $n$. Then

$$U(Z/nZ) \cong U(Z/2^aZ) \times U(Z/p_1^{a_1}Z) \times \cdots \times U(Z/p_l^{a_l}Z).$$

$U(Z/p_i^{a_i}Z)$ is a cyclic group of order $p_i^{a_i - 1}(p_i - 1)$. $U(Z/2^aZ)$ is cyclic of order
1 and 2 for $a = 1$ and 2, respectively. If $a \ge 3$, then it is the product of two
cyclic groups, one of order 2, the other of order $2^{a-2}$.

Proof. Theorems 2, 2′, and Theorem 1′ of Chapter 3. □

We conclude this section by giving an answer to the question of which 
integers possess primitive roots. 

Proposition 4.1.3. $n$ possesses primitive roots iff $n$ is of the form 2, 4, $p^a$, or $2p^a$,
where $p$ is an odd prime.

Proof. By Theorem 2′ we can assume that $n \ne 2^l$, $l \ge 3$. If $n$ is not of the given
form, it is easy to see that $n$ can be written as a product $m_1 m_2$, where $(m_1, m_2)
= 1$ and $m_1, m_2 > 2$. We then have that $\phi(m_1)$ and $\phi(m_2)$ are both even and
that $U(Z/nZ) \cong U(Z/m_1Z) \times U(Z/m_2Z)$. Both $U(Z/m_1Z)$ and $U(Z/m_2Z)$
have elements of order 2, but this shows that U(Z/nZ) is not cyclic since a
cyclic group contains at most one element of order 2. Thus $n$ does not possess
primitive roots.

We already know that 2, 4, and $p^a$ possess primitive roots. Since $U(Z/2p^aZ)
\cong U(Z/2Z) \times U(Z/p^aZ) \cong U(Z/p^aZ)$ it follows that $U(Z/2p^aZ)$ is cyclic;
i.e., $2p^a$ possesses primitive roots. □
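
For small $n$ the proposition can be confirmed by exhaustive search. The Python sketch below is our own illustration with hypothetical function names: one function looks for a generator of U(Z/nZ) by brute force, the other tests whether $n$ has one of the four shapes named in the proposition.

```python
from math import gcd


def multiplicative_order(a: int, n: int) -> int:
    """Smallest e >= 1 with a**e congruent to 1 mod n; assumes gcd(a, n) == 1 and n >= 2."""
    e, x = 1, a % n
    while x != 1:
        x = x * a % n
        e += 1
    return e


def has_primitive_root(n: int) -> bool:
    """Brute force: does some unit mod n have order phi(n)?  Requires n >= 2."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    return any(multiplicative_order(a, n) == len(units) for a in units)


def has_predicted_form(n: int) -> bool:
    """Is n equal to 2, 4, p**a, or 2 * p**a with p an odd prime?"""
    if n in (2, 4):
        return True
    if n % 2 == 0:
        n //= 2
    if n % 2 == 0 or n == 1:
        return False
    # n is now odd and greater than 1; its smallest divisor >= 3 is a prime p.
    p = next(d for d in range(3, n + 1, 2) if n % d == 0)
    while n % p == 0:
        n //= p
    return n == 1


print([n for n in range(2, 31) if has_primitive_root(n)])
print([n for n in range(2, 31) if has_predicted_form(n)])  # the two lists should agree
```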


§2 nth Power Residues 

Definition. If $m, n \in Z^+$, $a \in Z$, and $(a, m) = 1$, then we say that $a$ is an nth
power residue mod $m$ if $x^n \equiv a \ (m)$ is solvable.

Proposition 4.2.1. If $m \in Z^+$ possesses primitive roots and $(a, m) = 1$, then $a$ is
an nth power residue mod $m$ iff $a^{\phi(m)/d} \equiv 1 \ (m)$, where $d = (n, \phi(m))$.

Proof. Let $g$ be a primitive root mod $m$ and $a = g^b$, $x = g^y$. Then the
congruence $x^n \equiv a \ (m)$ is equivalent to $g^{ny} \equiv g^b \ (m)$, which in turn is equivalent
to $ny \equiv b \ (\phi(m))$. The latter congruence is solvable iff $d \mid b$. Moreover, it is
useful to notice that if there is one solution, there are exactly $d$ solutions.
If $d \mid b$, then $a^{\phi(m)/d} \equiv g^{b\phi(m)/d} \equiv 1 \ (m)$. Conversely, if $a^{\phi(m)/d} \equiv 1 \ (m)$, then
$g^{b\phi(m)/d} \equiv 1 \ (m)$, which implies that $\phi(m)$ divides $b\phi(m)/d$ or $d \mid b$. This proves
the result. □

The proof yields the following additional information. If $x^n \equiv a \ (m)$ is
solvable, there are exactly $(n, \phi(m))$ solutions.
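
Both the criterion and the count of solutions are easy to compare against a direct search. In the Python sketch below (our own illustration; all function names are hypothetical) the criterion is only meaningful when $m$ possesses primitive roots and $(a, m) = 1$, and these hypotheses are assumed rather than checked.

```python
from math import gcd


def phi(m: int) -> int:
    """Euler's function, computed directly from its definition."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def is_nth_power_residue(a: int, n: int, m: int) -> bool:
    """Criterion of Proposition 4.2.1; assumes m has primitive roots and gcd(a, m) == 1."""
    d = gcd(n, phi(m))
    return pow(a, phi(m) // d, m) == 1


def solutions(a: int, n: int, m: int) -> list:
    """All x in 0..m-1 with x**n congruent to a mod m, by exhaustive search."""
    return [x for x in range(m) if pow(x, n, m) == a % m]


# Cubes mod 13: d = (3, 12) = 3, so each cubic residue has exactly 3 cube roots.
for a in range(1, 13):
    print(a, is_nth_power_residue(a, 3, 13), solutions(a, 3, 13))
```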

Now suppose that $m = 2^e p_1^{e_1} \cdots p_l^{e_l}$. Then $x^n \equiv a \ (m)$ is solvable iff the
system of congruences

$$x^n \equiv a \ (2^e), \quad x^n \equiv a \ (p_1^{e_1}), \quad \ldots, \quad x^n \equiv a \ (p_l^{e_l})$$

is solvable. Since odd prime powers possess primitive roots we may apply
Proposition 4.2.1 to the last $l$ congruences. We are reduced to a consideration
of the congruence $x^n \equiv a \ (2^e)$. Since 2 and 4 possess primitive roots we may
further assume that $e \ge 3$.

Proposition 4.2.2. Suppose that $a$ is odd, $e \ge 3$, and consider the congruence
$x^n \equiv a \ (2^e)$. If $n$ is odd, a solution always exists and it is unique.

If $n$ is even, a solution exists iff $a \equiv 1 \ (4)$, $a^{2^{e-2}/d} \equiv 1 \ (2^e)$, where $d =
(n, 2^{e-2})$. When a solution exists there are exactly $2d$ solutions.

Proof. We leave the proof as an exercise. One begins by writing $a \equiv (-1)^s 5^r
\ (2^e)$ and $x \equiv (-1)^y 5^z \ (2^e)$. □

Propositions 4.2.1 and 4.2.2 give a fairly satisfactory answer to the
question: When is an integer $a$ an nth power residue mod $m$? It is possible to go
a bit further in some cases.

Proposition 4.2.3. If $p$ is an odd prime, $p \nmid a$, and $p \nmid n$, then if $x^n \equiv a \ (p)$ is
solvable, so is $x^n \equiv a \ (p^e)$ for all $e \ge 1$. All these congruences have the same
number of solutions.

Proof. If $n = 1$, the assertion is trivial, so we may assume $n \ge 2$. Suppose
that $x^n \equiv a \ (p^e)$ is solvable. Let $x_0$ be a solution and set $x_1 = x_0 + bp^e$. A
short computation shows $x_1^n \equiv x_0^n + nbp^e x_0^{n-1} \ (p^{e+1})$. We wish to solve
$x_1^n \equiv a \ (p^{e+1})$. This is equivalent to finding an integer $b$ such that $nx_0^{n-1}b \equiv
((a - x_0^n)/p^e) \ (p)$. Notice that $(a - x_0^n)/p^e$ is an integer and that $p \nmid nx_0^{n-1}$.
Thus this congruence is uniquely solvable for $b$, and with this value of $b$,
$x_1^n \equiv a \ (p^{e+1})$.

If $x^n \equiv a \ (p)$ has no solutions, then $x^n \equiv a \ (p^e)$ has no solutions. On the
other hand, if $x^n \equiv a \ (p)$ has a solution, so do all the congruences $x^n \equiv a \ (p^e)$,
as we have just seen. By the remark following Proposition 4.2.1 the number
of solutions to $x^n \equiv a \ (p^e)$ is $(n, \phi(p^e))$ provided one solution exists. If $p \nmid n$, it
is easy to see that $(n, \phi(p)) = (n, \phi(p^e))$ for all $e \ge 1$. This concludes the
proof. □
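
The first half of the proof is an algorithm: a solution mod $p^e$ is corrected by a multiple of $p^e$ to give a solution mod $p^{e+1}$. A minimal Python sketch of a single lifting step is given below. It is our own illustration, the function name is hypothetical, and it relies on the three-argument `pow` with exponent $-1$ (available from Python 3.8) to invert $nx_0^{n-1}$ mod $p$.

```python
def lift_solution(x0: int, n: int, a: int, p: int, e: int) -> int:
    """Given x0**n congruent to a mod p**e, with p not dividing a*n,
    return x1 = x0 + b*p**e satisfying x1**n congruent to a mod p**(e+1)."""
    pe = p**e
    # t is (a - x0**n) / p**e reduced mod p; the division is exact by hypothesis.
    t = ((a - pow(x0, n, pe * p)) // pe) % p
    # c is n * x0**(n-1) mod p, which is a unit since p divides neither n nor x0.
    c = n * pow(x0, n - 1, p) % p
    b = t * pow(c, -1, p) % p
    return (x0 + b * pe) % (pe * p)


# x = 3 solves x^3 = 2 (5); lift it step by step to a solution mod 5^4.
x = 3
for e in range(1, 4):
    x = lift_solution(x, 3, 2, 5, e)
    print(e + 1, x, pow(x, 3, 5**(e + 1)))  # the last column should always be 2
```

For instance, lifting $x_0 = 3$ from modulus 25 to modulus 125 gives 53, and indeed $53^3 = 148877 \equiv 2 \ (125)$.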

As usual the result for the powers of 2 is more complicated. 

Proposition 4.2.4. Let $2^l$ be the highest power of 2 dividing $n$. Suppose that $a$ is
odd and that $x^n \equiv a \ (2^{2l+1})$ is solvable. Then $x^n \equiv a \ (2^e)$ is solvable for all
$e \ge 2l + 1$ (and consequently for all $e \ge 1$). Moreover, all these congruences
have the same number of solutions.

Proof. We leave the proof as an exercise. One begins by assuming that
$x^n \equiv a \ (2^m)$, $m \ge 2l + 1$, has a solution $x_0$. Let $x_1 = x_0 + b2^{m-l}$. One shows,
by an appropriate choice of $b$, that $x_1^n \equiv a \ (2^{m+1})$. □

Notice that $x^2 \equiv 5 \ (2^2)$ is solvable (for example, $x = 1$) but that $x^2 \equiv
5 \ (2^3)$ is not. On the other hand, one can prove easily from the proposition
that if $a \equiv 1 \ (8)$, then $x^2 \equiv a \ (2^e)$ is solvable for all $e$ and conversely.
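
Both remarks can be checked by exhaustive search. The following Python sketch is our own illustration (the function name is hypothetical); it confirms the example with $a = 5$ and lets one test the criterion $a \equiv 1 \ (8)$ for several powers of 2.

```python
def is_square_mod(a: int, m: int) -> bool:
    """Exhaustive search for x with x**2 congruent to a mod m."""
    return any((x * x - a) % m == 0 for x in range(m))


print(is_square_mod(5, 4), is_square_mod(5, 8))  # True False

# Odd squares mod 2**e for e = 3, ..., 6: these should be the residues a with a = 1 (8).
for e in range(3, 7):
    print(e, [a for a in range(1, 2**e, 2) if is_square_mod(a, 2**e)])
```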

Notes 

Lemma 1 and its important consequence, Proposition 4.1.1, are due to 
J. Lagrange (1768). 

Fermat’s theorem [that $a^{p-1} \equiv 1 \ (p)$ if $p \nmid a$] was first proved by Euler.
Wilson’s theorem was stated by E. Waring and proved by Lagrange.

The important result on the existence of primitive roots modulo a prime
was asserted by Euler and, as we have mentioned, was first proved by Gauss.
The proofs of this result can be modified to prove the more general assertion
that a finite subgroup of the multiplicative group of a field is cyclic, i.e., is
generated by one element.

There are a number of interesting conjectures related to primitive roots.
The celebrated conjecture of E. Artin asserts that given an integer $a$ that is
not a square, and not $-1$, there are infinitely many primes for which $a$ is a
primitive root. In the case $a = 10$ this goes back to Gauss and amounts to
asserting the existence of infinitely many primes $p$ such that the period of the
decimal expansion of $1/p$ has length $p - 1$. (See Chapter 4 of Rademacher
[64] for an introduction to the theory of decimal expansions.) For an excellent
survey article devoted to the Artin conjecture and related questions, see
Goldstein [35].

Lehmer [54] discovered the following curious result. The first prime of
the form $326n^2 + 3$ for which 326 is not a primitive root must be bigger
than 10 million. He mentions other results of the same nature. It would be
interesting to see what is responsible for this strange behavior.

Given a prime $p$, what can be said about the size of the smallest positive
integer that is a primitive root mod $p$? This problem has given rise to a lot
of research. One contribution, due to L. K. Hua, is that the number in
question is less than $2^{m+1}p^{1/2}$, where $m$ is the number of distinct primes dividing
$p - 1$. For a discussion of this problem and a good bibliography, see Erdős
[31]. For other interesting results and problems see [76] and [12].

There exist many investigations into the existence of sequences of
consecutive integers each of which is a kth power modulo $p$. Consider primes of
the form $kt + 1$. A basic result due to A. Brauer asserts that if $m$ is a given
positive integer, then for all primes $p$ sufficiently large there are $m$ consecutive
integers $r, r + 1, \ldots, r + m - 1$ all of which are kth powers modulo $p$. The
question of finding the least such $r$ for given $p$ and $m$ is a problem of current
interest. For this, and a discussion of other open questions in this area, see
the article by Mills [59].

Given a prime $p$, what can be said about the size of the smallest positive
integer that is a nonsquare modulo $p$? An interesting conjecture is the
following: For a given $n$ the integer in question is smaller than $\sqrt[n]{p}$ for all
sufficiently large $p$. For more discussion, see P. Erdős [31] and Chapter 3
of Chowla [18].

Finally, we mention that an analog of the Artin conjecture on primitive
roots has actually been proved in the ring $k[x]$ by H. Bilharz [8]. Bilharz
proved his theorem under the assumption that the Riemann hypothesis
holds for the so-called congruence zeta function (see Chapter 11). This was
actually proved several years later by A. Weil. In recent years C. Hooley was
able to prove that Artin’s original conjecture was correct under the
assumption that the extended Riemann hypothesis holds in algebraic number fields
[46]. For a discussion of the classical Riemann hypothesis and its
consequences, see Chowla [18]. No one at present seems to have the slightest idea
as to how to prove the Riemann hypothesis for number fields so that it seems
clear that Hooley is not about to have the same good luck that Bilharz
enjoyed.


Exercises

1. Show that 2 is a primitive root modulo 29.

2. Compute all primitive roots for $p = 11$, 13, 17, and 19.

3. Suppose that $a$ is a primitive root modulo $p^n$, $p$ an odd prime. Show that $a$ is a primitive
root modulo $p$.

4. Consider a prime $p$ of the form $4t + 1$. Show that $a$ is a primitive root modulo
$p$ iff $-a$ is a primitive root modulo $p$.

5. Consider a prime $p$ of the form $4t + 3$. Show that $a$ is a primitive root modulo
$p$ iff $-a$ has order $(p - 1)/2$.

6. If $p = 2^n + 1$ is a Fermat prime, show that 3 is a primitive root modulo $p$.

7. Suppose that $p$ is a prime of the form $8t + 3$ and that $q = (p - 1)/2$ is also a prime.
Show that 2 is a primitive root modulo $p$.

8. Let $p$ be an odd prime. Show that $a$ is a primitive root modulo $p$ iff $a^{(p-1)/q} \not\equiv 1 \ (p)$ for
all prime divisors $q$ of $p - 1$.

9. Show that the product of all the primitive roots modulo $p$ is congruent to $(-1)^{\phi(p-1)}$
modulo $p$.

10. Show that the sum of all the primitive roots modulo $p$ is congruent to $\mu(p - 1)$
modulo $p$.

11. Prove that $1^k + 2^k + \cdots + (p - 1)^k \equiv 0 \ (p)$ if $p - 1 \nmid k$ and $-1 \ (p)$ if $p - 1 \mid k$.

12. Use the existence of a primitive root to give another proof of Wilson’s theorem
$(p - 1)! \equiv -1 \ (p)$.

13. Let $G$ be a finite cyclic group and $g \in G$ a generator. Show that all the other generators
are of the form $g^k$, where $(k, n) = 1$, $n$ being the order of $G$.

14. Let $A$ be a finite abelian group and $a, b \in A$ elements of order $m$ and $n$, respectively.
If $(m, n) = 1$, prove that $ab$ has order $mn$.

15. Let $K$ be a field and $G \subseteq K^*$ a finite subgroup of the multiplicative group of $K$.
Extend the arguments used in the proof of Theorem 1 to show that $G$ is cyclic.

16. Calculate the solutions to $x^3 \equiv 1 \ (19)$ and $x^4 \equiv 1 \ (17)$.

17. Use the fact that 2 is a primitive root modulo 29 to find the seven solutions to
$x^7 \equiv 1 \ (29)$.

18. Solve the congruence $1 + x + x^2 + \cdots + x^6 \equiv 0 \ (29)$.

19. Determine the numbers $a$ such that $x^3 \equiv a \ (p)$ is solvable for $p = 7$, 11, and 13.

20. Let $p$ be a prime and $d$ a divisor of $p - 1$. Show that the dth powers form a subgroup
of U(Z/pZ) of order $(p - 1)/d$. Calculate this subgroup for $p = 11$, $d = 5$; $p = 17$,
$d = 4$; $p = 19$, $d = 6$.

21. If $g$ is a primitive root modulo $p$ and $d \mid p - 1$, show that $g^{(p-1)/d}$ has order $d$. Show also
that $a$ is a dth power iff $a \equiv g^{kd} \ (p)$ for some $k$. Do Exercises 16-20 making use of these
observations.

22. If $a$ has order 3 modulo $p$, show that $1 + a$ has order 6.

23. Show that $x^2 \equiv -1 \ (p)$ has a solution iff $p \equiv 1 \ (4)$ and that $x^4 \equiv -1 \ (p)$ has a
solution iff $p \equiv 1 \ (8)$.

24. Show that $ax^m + by^n \equiv c \ (p)$ has the same number of solutions as $ax^{m'} + by^{n'} \equiv c \ (p)$,
where $m' = (m, p - 1)$ and $n' = (n, p - 1)$.

25. Prove Propositions 4.2.2 and 4.2.4.



Chapter 5 

Quadratic Reciprocity 


If $p$ is a prime, the discussion of the congruence $x^2 \equiv a \ (p)$
is fairly easy. It is solvable iff $a^{(p-1)/2} \equiv 1 \ (p)$. With this
fact in hand a complete analysis is a simple matter.
However, if the question is turned around, the problem is
much more difficult. Suppose that $a$ is an integer. For
which primes $p$ is the congruence $x^2 \equiv a \ (p)$ solvable?
The answer is provided by the law of quadratic reciprocity.
This law was formulated by Euler and A. M. Legendre
but Gauss was the first to provide a complete proof.
Gauss was extremely proud of this result. He called it
the Theorema Aureum, the golden theorem.


§1 Quadratic Residues 

If $(a, m) = 1$, $a$ is called a quadratic residue mod $m$ if the congruence $x^2 \equiv
a \ (m)$ has a solution. Otherwise $a$ is called a quadratic nonresidue mod $m$.

For example, 2 is a quadratic residue mod 7, but 3 is not. In fact, $1^2, 2^2,
3^2, 4^2, 5^2$, and $6^2$ are congruent to 1, 4, 2, 2, 4, and 1, respectively. Thus 1, 2,
and 4 are quadratic residues, and 3, 5, and 6 are not.

Given any fixed positive integer $m$ it is possible to determine the quadratic
residues by simply listing the positive integers less than and prime to $m$,
squaring them, and reducing mod $m$. This is what we have just done for
$m = 7$.

The following proposition gives a less tedious way of deciding when a 
given integer is a quadratic residue mod m. 

Proposition 5.1.1. Let $m = 2^e p_1^{e_1} \cdots p_l^{e_l}$ be the prime decomposition of $m$, and
suppose that $(a, m) = 1$. Then $x^2 \equiv a \ (m)$ is solvable iff the following conditions
are satisfied:

(a) If $e = 2$, then $a \equiv 1 \ (4)$.

If $e \ge 3$, then $a \equiv 1 \ (8)$.

(b) For each $i$ we have $a^{(p_i - 1)/2} \equiv 1 \ (p_i)$.

Proof. By the Chinese Remainder Theorem the congruence $x^2 \equiv a \ (m)$ is
equivalent to the system $x^2 \equiv a \ (2^e)$, $x^2 \equiv a \ (p_1^{e_1})$, $\ldots$, $x^2 \equiv a \ (p_l^{e_l})$.

Consider $x^2 \equiv a \ (2^e)$. 1 is the only quadratic residue mod 4, and 1 is the
only quadratic residue mod 8. Thus we have solvability iff $a \equiv 1 \ (4)$ if $e = 2$
and $a \equiv 1 \ (8)$ if $e = 3$. A direct application of Proposition 4.2.4 shows that
$x^2 \equiv a \ (8)$ is solvable iff $x^2 \equiv a \ (2^e)$ is solvable for all $e \ge 3$.

Now consider $x^2 \equiv a \ (p_i^{e_i})$. Since $(2, p_i) = 1$ it follows from Proposition
4.2.3 that this congruence is solvable iff $x^2 \equiv a \ (p_i)$ is solvable. To this
congruence apply Proposition 4.2.1 with $n = 2$, $m = p$, and $d = (n, \phi(m)) =
(2, p - 1) = 2$. We obtain that $x^2 \equiv a \ (p_i)$ is solvable iff $a^{(p_i - 1)/2} \equiv 1 \ (p_i)$.

□
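
The proposition translates directly into a test that avoids squaring every residue. The Python sketch below is our own illustration with hypothetical function names; it factors $m$ by trial division, applies conditions (a) and (b), and can be compared with the listing method described earlier. The hypothesis $(a, m) = 1$ is assumed, not checked.

```python
from math import gcd


def factorize(m: int) -> dict:
    """Prime decomposition of m as a dict {prime: exponent}, by trial division."""
    out, d = {}, 2
    while d * d <= m:
        while m % d == 0:
            out[d] = out.get(d, 0) + 1
            m //= d
        d += 1
    if m > 1:
        out[m] = out.get(m, 0) + 1
    return out


def is_qr_by_criterion(a: int, m: int) -> bool:
    """Conditions (a) and (b) of Proposition 5.1.1; assumes gcd(a, m) == 1."""
    for p, e in factorize(m).items():
        if p == 2:
            if e == 2 and a % 4 != 1:
                return False
            if e >= 3 and a % 8 != 1:
                return False
        elif pow(a, (p - 1) // 2, p) != 1:
            return False
    return True


def is_qr_by_search(a: int, m: int) -> bool:
    """The listing method: square every residue and compare."""
    return any((x * x - a) % m == 0 for x in range(m))


m = 56  # 56 = 2^3 * 7
print([a for a in range(1, m) if gcd(a, m) == 1 and is_qr_by_criterion(a, m)])
print([a for a in range(1, m) if gcd(a, m) == 1 and is_qr_by_search(a, m)])
```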

This result reduces questions about quadratic residues to the
corresponding questions for prime moduli. In what follows $p$ will denote an odd prime.

Definition. The symbol $(a/p)$ will have the value 1 if $a$ is a quadratic residue
mod $p$, $-1$ if $a$ is a quadratic nonresidue mod $p$, and zero if $p \mid a$. $(a/p)$ is called
the Legendre symbol.

The Legendre symbol is an extremely convenient device for discussing 
quadratic residues. We shall list some of its properties. 

Proposition 5.1.2.

(a) $a^{(p-1)/2} \equiv (a/p) \ (p)$.

(b) $(ab/p) = (a/p)(b/p)$.

(c) If $a \equiv b \ (p)$, then $(a/p) = (b/p)$.

Proof. If $p$ divides $a$ or $b$, all three assertions are trivial. Assume that $p \nmid a$ and
that $p \nmid b$.

We know that $a^{p-1} \equiv 1 \ (p)$; thus $(a^{(p-1)/2} + 1)(a^{(p-1)/2} - 1) = a^{p-1} -
1 \equiv 0 \ (p)$. It follows that $a^{(p-1)/2} \equiv \pm 1 \ (p)$. By Proposition 5.1.1, $a^{(p-1)/2} \equiv
1 \ (p)$ iff $a$ is a quadratic residue mod $p$. This proves part (a).

To prove part (b) we apply part (a). $(ab)^{(p-1)/2} \equiv (ab/p) \ (p)$ and $(ab)^{(p-1)/2}
= a^{(p-1)/2} b^{(p-1)/2} \equiv (a/p)(b/p) \ (p)$. Thus $(ab/p) \equiv (a/p)(b/p) \ (p)$, which
implies that $(ab/p) = (a/p)(b/p)$.

Part (c) is obvious from the definition. □
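
Part (a) gives a practical way to evaluate the Legendre symbol: compute $a^{(p-1)/2}$ mod $p$ and read off 0, 1, or $-1$. A minimal Python sketch, which is our own illustration and assumes that $p$ is an odd prime, is:

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via part (a) of Proposition 5.1.2."""
    r = pow(a, (p - 1) // 2, p)     # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


print([legendre(a, 7) for a in range(1, 7)])  # [1, 1, -1, 1, -1, -1]

# Part (b): the symbol is multiplicative.
print(legendre(3 * 5, 7) == legendre(3, 7) * legendre(5, 7))  # True
```

The first line of output agrees with the earlier example: 1, 2, and 4 are residues mod 7, and 3, 5, and 6 are not.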

Corollary 1. There are as many residues as nonresidues mod $p$.*

Proof. $a^{(p-1)/2} \equiv 1 \ (p)$ has $(p - 1)/2$ solutions. Thus there are $(p - 1)/2$
residues and $p - 1 - ((p - 1)/2) = (p - 1)/2$ nonresidues. □

Corollary 2. The product of two residues is a residue, the product of two
nonresidues is a residue, and the product of a residue and a nonresidue is a
nonresidue.

Proof. This all follows easily from part (b). □

* In the remainder of this chapter “residues” and “nonresidues” refer to quadratic residues and
quadratic nonresidues.

Corollary 3. $(-1)^{(p-1)/2} = (-1/p)$.

Proof. Substitute $a = -1$ in part (a). □

Corollary 3 is particularly interesting. Every odd integer has the form
$4k + 1$ or $4k + 3$. Using this one can restate Corollary 3 as follows: $x^2 \equiv
-1 \ (p)$ has a solution iff $p$ is of the form $4k + 1$. Thus $-1$ is a residue of the
primes 5, 13, 17, 29, ... and a nonresidue of the primes 3, 7, 11, 19, .... The
reader should check some of these assertions numerically.

One is led by this result to ask a more general question. If $a$ is an integer,
for which primes $p$ is $a$ a quadratic residue mod $p$? The answer to this question
is provided by the law of quadratic reciprocity to whose statement and proof
we shall soon devote a great deal of attention.

Corollary 3 enables us to prove that there are infinitely many primes of
the form $4k + 1$. Suppose that $p_1, p_2, \ldots, p_m$ are a finite set of such primes and
consider $(2p_1 p_2 \cdots p_m)^2 + 1$. Suppose that $p$ divides this integer. $-1$ will
then be a quadratic residue mod $p$ and thus $p$ will be of the form $4k + 1$. $p$ is
not among the $p_i$ since $(2p_1 p_2 \cdots p_m)^2 + 1$ leaves a remainder of 1 when
divided by $p_i$. We have shown that every finite set of primes of the form
$4k + 1$ excludes some primes of that form. Thus the set of such primes is
infinite.

To return to the theory of quadratic residues, we are now going to intro¬ 
duce another characterization of the symbol (a/p) due to Gauss. 

Consider S = { — (p — 1)/2, — (p — 3)/2, • • • ，— 1 ， 1 ， 2， . • • ， （p — 1)/2}. 
This is called the set of least residues mod p.UpJf a, let \i be the number of 
negative least residues of the integers a, 2a ， 3a,, ((p — \)/2)a. For example, 
let p = 7 and a = 4. Then (p — 1)/2 = 3, and 1. 4, 2 • 4, and 3 • 4 are con¬ 
gruent to —3, 1， and —2, respectively. Thus in this case " = 2. 

Lemma (Gauss’ Lemma), (a/p ) : = (-l 广 

Proof. Let ±m_l be the least residue of la, where m_l is positive. As l ranges
between 1 and (p - 1)/2, μ is clearly the number of minus signs that occur in
this way. We claim that m_l ≠ m_k if l ≠ k and 1 ≤ l, k ≤ (p - 1)/2. For, if
m_l = m_k, then la ≡ ±ka (p), and since p ∤ a this implies that l ± k ≡ 0 (p).
The latter congruence is impossible since l ≠ k and |l ± k| ≤ |l| + |k| ≤
p - 1. It follows that the sets {1, 2, ..., (p - 1)/2} and {m_1, m_2, ..., m_((p-1)/2)}
coincide. Multiply the congruences 1·a ≡ ±m_1 (p), 2·a ≡ ±m_2 (p), ...,
((p - 1)/2)a ≡ ±m_((p-1)/2) (p). We obtain

\[ \left(\frac{p-1}{2}\right)!\, a^{(p-1)/2} \equiv (-1)^{\mu} \left(\frac{p-1}{2}\right)! \ (p). \]

This yields a^((p-1)/2) ≡ (-1)^μ (p). By Proposition 5.1.2, a^((p-1)/2) ≡ (a/p) (p).
The result follows. □
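
Gauss’ lemma is easy to test numerically. The following sketch counts μ directly from the definition and compares (-1)^μ with Euler's criterion (Proposition 5.1.2); the function names are illustrative and not taken from the text.

```python
def least_residue(n, p):
    """Representative of n mod p lying in the interval [-(p-1)/2, (p-1)/2]."""
    r = n % p
    return r - p if r > (p - 1) // 2 else r

def gauss_mu(a, p):
    """Number of negative least residues among a, 2a, ..., ((p-1)/2) a."""
    return sum(1 for l in range(1, (p - 1) // 2 + 1) if least_residue(l * a, p) < 0)

def legendre_euler(a, p):
    """Legendre symbol (a/p) for p not dividing a, via a^((p-1)/2) mod p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

print(gauss_mu(4, 7))                      # the worked example p = 7, a = 4
for p in [3, 5, 7, 11, 13]:
    for a in range(1, p):
        assert (-1) ** gauss_mu(a, p) == legendre_euler(a, p)
```

For p = 7 and a = 4 the count agrees with the value μ = 2 found above.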

Gauss’s lemma is an extremely powerful tool. We shall base our first 
proof of the quadratic reciprocity law on it. Before getting to that, however, 
we can use it immediately to get a characterization of those primes for which
2 is a quadratic residue. 

Proposition 5.1.3. 2 is a quadratic residue of primes of the form 8k + 1 and
8k + 7. 2 is a quadratic nonresidue of primes of the form 8k + 3 and 8k + 5.
This information is summarized in the formula

\[ (2/p) = (-1)^{(p^2-1)/8}. \]

Proof. We leave to the reader the task of showing that the formula is
equivalent to the first two assertions.

Let p be an odd prime (as usual) and notice that the number μ is equal to
the number of elements of the set 2·1, 2·2, ..., 2·(p - 1)/2 that exceed
(p - 1)/2. Let m be determined by the two conditions 2m ≤ (p - 1)/2 and
2(m + 1) > (p - 1)/2. Then μ = ((p - 1)/2) - m.

If p = 8k + 1, then (p - 1)/2 = 4k and m = 2k. Thus μ = 4k - 2k = 2k
is even and (2/p) = 1.

If p = 8k + 7, then (p - 1)/2 = 4k + 3, m = 2k + 1, and μ = 4k + 3 -
(2k + 1) = 2k + 2 is even. Thus (2/p) = 1 in this case as well.

If p = 8k + 3, then (p - 1)/2 = 4k + 1, m = 2k, and μ = 4k + 1 - 2k =
2k + 1 is odd. Thus (2/p) = -1.

Finally, if p = 8k + 5, then (p - 1)/2 = 4k + 2, m = 2k + 1, and
μ = 4k + 2 - (2k + 1) = 2k + 1 is odd. Thus (2/p) = -1 and we are done. □

As an example, consider p = 7 and p = 17. These primes are congruent
to 7 and 1, respectively, mod 8, and indeed 3^2 ≡ 2 (7) and 6^2 ≡ 2 (17). On
the other hand, p = 19 and p = 5 are congruent to 3 and 5, respectively, mod 8,
and it is easily checked numerically that 2 is a quadratic nonresidue for both
primes.
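
A short script makes this check systematic. It is a sketch under the obvious assumptions (brute-force search over small primes; the function names are invented for the illustration) and compares three descriptions of the quadratic character of 2.

```python
def two_is_residue(p):
    """True if x^2 = 2 (mod p) is solvable, found by exhaustive search."""
    return any((x * x - 2) % p == 0 for x in range(1, p))

def two_symbol_formula(p):
    """The closed formula (2/p) = (-1)^((p^2 - 1)/8) of Proposition 5.1.3."""
    return (-1) ** ((p * p - 1) // 8)

for p in [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]:
    by_search = two_is_residue(p)
    by_formula = two_symbol_formula(p) == 1
    by_class = p % 8 in (1, 7)
    print(p, p % 8, by_search, by_formula, by_class)
```

The three truth values on each line should coincide.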

One can use Proposition 5.1.3 to prove that there are infinitely many
primes of the form 8k + 7. Let p_1, ..., p_m be a finite collection of such primes,
and consider (4 p_1 p_2 ... p_m)^2 - 2. The odd prime divisors of this number
have the form 8k + 1 or 8k + 7, since for such prime divisors 2 is a quadratic
residue. Not all the odd prime divisors can have the form 8k + 1 (prove it).
Let p be a prime divisor of the form 8k + 7. Then p is not in the set
{p_1, p_2, ..., p_m} and we are done.


§2 Law of Quadratic Reciprocity 

Theorem 1 (Law of Quadratic Reciprocity). Let p and q be odd primes. Then

(a) (-1/p) = (-1)^((p-1)/2).

(b) (2/p) = (-1)^((p^2-1)/8).

(c) (p/q)(q/p) = (-1)^(((p-1)/2)((q-1)/2)).

We are going to postpone the proof until Section 3. In Chapter 6 we shall
prove the theorem once again from a different standpoint, and also indicate 
something of its history. It is among the deepest and most beautiful results of 
elementary number theory and the beginning of a line of reciprocity theorems 
that culminate in the very general Artin reciprocity law, perhaps the most 
impressive theorem in all number theory. It would take us far outside the 
compass of this book to even state the Artin reciprocity law, but in Chapter 9 
we shall state and prove the laws of cubic and biquadratic reciprocity. 

Parts (a) and (b) of Theorem 1 have already been proven and some of
their consequences discussed. Let us turn our attention to part (c).

If either p or q is of the form 4k + 1, then ((p - 1)/2)((q - 1)/2) ≡ 0 (2).
If both p and q are of the form 4k + 3, then ((p - 1)/2)((q - 1)/2) ≡ 1 (2).
This permits us to restate part (c) as follows:

(1) If either p or q is of the form 4k + 1, then p is a quadratic residue mod q
iff q is a quadratic residue mod p.

(2) If both p and q are of the form 4k + 3, then p is a quadratic residue mod q
iff q is a quadratic nonresidue mod p.

As a first application of quadratic reciprocity we show how, in conjunction
with Proposition 5.1.2, it can be used in numerical computations of the
Legendre symbol. A single example should suffice to illustrate the method.

We propose to calculate (79/101). Since 101 ≡ 1 (4) we have (79/101) =
(101/79) = (22/79). The last step follows from 101 ≡ 22 (79). Further,
(22/79) = (2/79)(11/79). Now 79 ≡ 7 (8). Thus (2/79) = 1. Since both 11
and 79 are congruent to 3 mod 4 we have (11/79) = -(79/11) = -(2/11).
Finally 11 ≡ 3 (8) implies that (2/11) = -1. Therefore (79/101) = 1; i.e., 79
is a quadratic residue mod 101. Indeed, 33^2 ≡ 79 (101).
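
Each step of this hand computation can be confirmed independently with Euler's criterion. The sketch below is only a check, not the method of the text; `legendre` is an illustrative helper.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

# Every symbol that appears in the computation of (79/101).
for a, p in [(79, 101), (101, 79), (22, 79), (2, 79), (11, 79), (79, 11), (2, 11)]:
    print(f"({a}/{p}) = {legendre(a, p)}")

# The square roots of 79 mod 101, found by exhaustive search.
roots = [x for x in range(1, 101) if (x * x - 79) % 101 == 0]
print(roots)
```

The list of roots contains 33 together with its negative 68 = 101 - 33.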

The next application is perhaps more significant. We noticed earlier that
-1 is a quadratic residue of primes of the form 4k + 1 and that 2 is a
quadratic residue of primes that are either of the form 8k + 1 or 8k + 7. If a
is an arbitrary integer, for what primes p is a a quadratic residue mod p? We
are now in a position to give an answer. To begin with, we consider the case
where a = q, an odd prime.

Theorem 2. Let q be an odd prime.

(a) If q ≡ 1 (4), then q is a quadratic residue mod p iff p ≡ r (q), where r is a
quadratic residue mod q.

(b) If q ≡ 3 (4), then q is a quadratic residue mod p iff p ≡ ±b^2 (4q), where b
is an odd integer prime to q.

Proof. If q ≡ 1 (4), then by Theorem 1 we have (q/p) = (p/q). Part (a) is thus
clear.

If q ≡ 3 (4), Theorem 1 yields (q/p) = (-1)^((p-1)/2) (p/q). Assume first that
p ≡ ±b^2 (4q), where b is odd. If we take the plus sign, we get p ≡ b^2 ≡ 1 (4)
and p ≡ b^2 (q). Thus (-1)^((p-1)/2) = 1 and (p/q) = 1, giving (q/p) = 1. If we
take the minus sign, then p ≡ -b^2 ≡ -1 ≡ 3 (4) and p ≡ -b^2 (q). The
first congruence shows that (-1)^((p-1)/2) = -1. The second shows that (p/q) =
(-b^2/q) = (-1/q)(b/q)^2 = (-1/q) = -1 since q ≡ 3 (4). Once again we
have (q/p) = 1.

To go the other way, assume that (q/p) = 1. We have two cases to deal
with:

(1) (-1)^((p-1)/2) = -1 and (p/q) = -1.

(2) (-1)^((p-1)/2) = 1 and (p/q) = 1.

In case 2 we have p ≡ b^2 (q) and p ≡ 1 (4). b can be assumed to be odd
since if it is even we can use b' = b + q instead. If b is odd, then b^2 ≡ 1 (4)
and p ≡ b^2 (4) and thus p ≡ b^2 (4q), as required.

In case 1 we have p ≡ 3 (4) and p ≡ -b^2 (q). The last congruence follows
since q ≡ 3 (4) implies that every nonresidue is the negative of a residue
(prove it). Again, we may assume that b is odd. In that case -b^2 ≡ 3 (4) so
p ≡ -b^2 (4) and p ≡ -b^2 (4q). This concludes the proof. □

Take q = 3 as a first illustration. By part (b) of Theorem 2 we must find
the residues mod 12 of the squares of odd integers prime to 3. 1^2, 5^2, 7^2, and
11^2 are all congruent to 1. Thus 3 is a quadratic residue of primes p congruent
to ±1 (12) and a quadratic nonresidue of primes congruent to ±5 (12).

Next consider q = 5. Since 5 ≡ 1 (4) we are in the simpler part (a) of
Theorem 2. 1 and 4 are the residues mod 5, and 2 and 3 the nonresidues. Thus
5 is a residue of primes congruent to 1 or 4 mod 5 and a nonresidue of primes
congruent to 2 or 3 mod 5.
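
These two illustrations can be checked empirically by sorting small primes into residue classes. The sketch below is only a sanity check, under the assumption that primes below 500 are enough to populate every class; all names are illustrative.

```python
def is_prime(n):
    """Primality by trial division, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def split_classes(q, modulus, bound=500):
    """Classes mod `modulus` of odd primes p < bound, p != q, split by the sign of (q/p)."""
    residue, nonresidue = set(), set()
    for p in range(3, bound, 2):
        if is_prime(p) and p != q:
            (residue if legendre(q, p) == 1 else nonresidue).add(p % modulus)
    return residue, nonresidue

print(split_classes(3, 12))   # classes mod 12 for q = 3
print(split_classes(5, 5))    # classes mod 5 for q = 5
```

For q = 3 the two sets should be {1, 11} and {5, 7}; for q = 5 they should be {1, 4} and {2, 3}.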

“Numbers congruent to b mod m” and “numbers of the form mk + b” are
shorthand expressions describing the set {b, b ± m, b ± 2m, ...}. This set is
an arithmetic progression with initial term b and difference m. In our
investigations so far we have seen that the answer to the question "for which
primes p is a a quadratic residue?" has been: for those primes p that occur in a
certain fixed, finite number of arithmetic progressions. This situation is
entirely general. Instead of stating this result as a theorem (the statement
would be very complicated) we shall work out a few numerical examples.

For a = -3, (-3/p) = (-1/p)(3/p). Thus -3 is a quadratic residue
mod p if either (-1/p) = 1 and (3/p) = 1 or (-1/p) = -1 and (3/p) = -1.

By our previous results the first case obtains when p ≡ 1 (4) and p ≡
±1 (12). If p ≡ -1 (12), then p ≡ -1 (4). The only primes that satisfy
both congruences are p ≡ 1 (12).

In the second case p ≡ 3 (4) and p ≡ ±5 (12). If p ≡ 5 (12), then p ≡ 1 (4).
Thus the only primes that satisfy both these congruences are p ≡ -5 (12).

Summarizing, -3 is a quadratic residue mod p iff p is congruent to 1 or
-5 mod 12.

Now consider a = 6. Since (6/p) = (2/p)(3/p) we again have two cases:
(2/p) = 1 and (3/p) = 1 or (2/p) = -1 and (3/p) = -1. The first case holds
if p ≡ 1, 7 (8) and p ≡ 1, 11 (12). The only two pairs of congruences that are
compatible are p ≡ 1 (8) and p ≡ 1 (12), and p ≡ 7 (8) and p ≡ 11 (12). By
standard techniques (see Chapter 3) the primes satisfying these congruences
are congruent to 1 or 23 mod 24.

In the second case we have to consider p ≡ 3, 5 (8) and p ≡ 5, 7 (12).
Separating these into four pairs of congruences we see that the only solutions
are congruent to 5 and 19 mod 24.

Summarizing, 6 is a quadratic residue mod p iff p ≡ 1, 5, 19, 23 (24).

As a numerical check we see for the primes 73, 5, 19, and 23 that 15^2 ≡
6 (73), 1^2 ≡ 6 (5), 5^2 ≡ 6 (19), and 11^2 ≡ 6 (23).
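
The step of combining congruences mod 8 with congruences mod 12 can be done by brute force over the 24 classes mod 24. The following sketch is a substitute for the "standard techniques" of Chapter 3, not a transcription of them; `combine` is a hypothetical helper name.

```python
def combine(classes_mod_8, classes_mod_12):
    """Classes r mod 24 with r mod 8 in the first set and r mod 12 in the second."""
    return [r for r in range(24) if r % 8 in classes_mod_8 and r % 12 in classes_mod_12]

case_1 = combine({1, 7}, {1, 11})   # (2/p) = 1 and (3/p) = 1
case_2 = combine({3, 5}, {5, 7})    # (2/p) = -1 and (3/p) = -1
print(case_1, case_2)

# The numerical checks quoted in the text: (prime, square root of 6).
checks = [(73, 15), (5, 1), (19, 5), (23, 11)]
print(all((x * x - 6) % p == 0 for p, x in checks))
```

Incompatible pairs, such as p ≡ 1 (8) together with p ≡ 11 (12), simply contribute no class mod 24.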

As a final application of the quadratic reciprocity law we investigate the
question: if a is a quadratic residue mod all primes p not dividing a, what
can be said about a? If a is a square, it is a residue for all primes not dividing a.
It turns out that the converse of this statement is true as well. In fact, we shall
soon prove an even stronger result. First, however, it is necessary to define
and investigate briefly a new symbol.

Definition. Let b be an odd, positive integer and a any integer. Let b =
p_1 p_2 ... p_m, where the p_i are (not necessarily distinct) primes. The symbol
(a/b) defined by

\[ (a/b) = (a/p_1)(a/p_2) \cdots (a/p_m) \]

is called the Jacobi symbol.

The Jacobi symbol has properties that are remarkably similar to those of the
Legendre symbol, which it generalizes. A word of caution is useful: (a/b) may
equal 1 without a being a quadratic residue mod b. For example, (2/15) =
(2/3)(2/5) = (-1)(-1) = 1, but 2 is not a quadratic residue mod 15. It is
true, however, that if (a/b) = -1, then a is a quadratic nonresidue mod b.
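
The cautionary example can be reproduced in a few lines. This is a sketch that evaluates the Jacobi symbol straight from the definition, with the prime factorization of b supplied by hand; the helper names are illustrative.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def jacobi_from_factors(a, primes):
    """Jacobi symbol (a/b) for b = product of `primes`, directly from the definition."""
    result = 1
    for p in primes:
        result *= legendre(a, p)
    return result

symbol = jacobi_from_factors(2, [3, 5])           # (2/15) = (2/3)(2/5)
squares_mod_15 = sorted({x * x % 15 for x in range(15)})
print(symbol, squares_mod_15, 2 in squares_mod_15)
```

The symbol equals 1 even though 2 does not occur among the squares mod 15.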

Proposition 5.2.1.

(a) (a_1/b) = (a_2/b) if a_1 ≡ a_2 (b).

(b) (a_1 a_2/b) = (a_1/b)(a_2/b).

(c) (a/b_1 b_2) = (a/b_1)(a/b_2).

Proof. Parts (a) and (b) are immediate from the corresponding properties
of the Legendre symbol. Part (c) is obvious from the definition. □

Lemma. Let r and s be odd integers. Then

(a) (rs - 1)/2 ≡ ((r - 1)/2) + ((s - 1)/2) (2).

(b) (r^2 s^2 - 1)/8 ≡ ((r^2 - 1)/8) + ((s^2 - 1)/8) (2).

Proof. Since (r - 1)(s - 1) ≡ 0 (4) we have rs - 1 ≡ (r - 1) + (s - 1) (4).
Part (a) follows by dividing by 2.

r^2 - 1 and s^2 - 1 are both divisible by 4. Thus (r^2 - 1)(s^2 - 1) ≡ 0 (16)
and r^2 s^2 - 1 ≡ (r^2 - 1) + (s^2 - 1) (16). Part (b) follows upon dividing by 8. □


Corollary. Let r_1, r_2, ..., r_m be odd integers. Then

(a) Σ_{i=1}^{m} (r_i - 1)/2 ≡ (r_1 r_2 ... r_m - 1)/2 (2).

(b) Σ_{i=1}^{m} (r_i^2 - 1)/8 ≡ (r_1^2 r_2^2 ... r_m^2 - 1)/8 (2).

Proof. The proof is a simple induction on m, using the lemma. □


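Proposition 5.2.2.

(a) (-1/b) = (-1)^((b-1)/2).

(b) (2/b) = (-1)^((b^2-1)/8).

(c) If a is odd and positive as well as b, then

\[ (a/b)(b/a) = (-1)^{((a-1)/2)((b-1)/2)}. \]

Proof.

\[ (-1/b) = (-1/p_1)(-1/p_2) \cdots (-1/p_m) = (-1)^{(p_1-1)/2} \cdots (-1)^{(p_m-1)/2} = (-1)^{\sum_i (p_i-1)/2}. \]

By the lemma Σ (p_i - 1)/2 ≡ (p_1 p_2 ... p_m - 1)/2 ≡ (b - 1)/2 (2). This
proves part (a).

Part (b) is proved in exactly the same way.

Now if a = q_1 q_2 ... q_l, then

\[ (a/b)(b/a) = \prod_i \prod_j (q_i/p_j)(p_j/q_i) = (-1)^{\sum_i \sum_j ((q_i-1)/2)((p_j-1)/2)}. \]

The product and sum range over 1 ≤ i ≤ l and 1 ≤ j ≤ m. Again using the
lemma we have

\[ \sum_i \sum_j \frac{q_i-1}{2}\,\frac{p_j-1}{2} = \left(\sum_i \frac{q_i-1}{2}\right)\left(\sum_j \frac{p_j-1}{2}\right) \equiv \frac{a-1}{2}\,\frac{b-1}{2} \ (2). \]

This proves part (c). □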
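
Propositions 5.2.1 and 5.2.2 together yield a procedure for evaluating (a/b) without factoring b. The sketch below is one standard way of organizing that computation, not the authors' own presentation: reduce a mod b, strip factors of 2 using part (b), and flip the two entries using part (c). The function name `jacobi` is an illustrative choice.

```python
def jacobi(a, b):
    """Jacobi symbol (a/b) for an odd positive integer b."""
    if b <= 0 or b % 2 == 0:
        raise ValueError("b must be odd and positive")
    a %= b                                  # Proposition 5.2.1(a)
    result = 1
    while a != 0:
        while a % 2 == 0:                   # pull out factors of 2
            a //= 2
            if b % 8 in (3, 5):             # (2/b) = -1 exactly when b = 3, 5 (8)
                result = -result
        a, b = b, a                         # reciprocity, Proposition 5.2.2(c)
        if a % 4 == 3 and b % 4 == 3:       # sign changes iff both are 3 mod 4
            result = -result
        a %= b
    return result if b == 1 else 0          # 0 when the original a, b share a factor

print(jacobi(79, 101), jacobi(2, 15))
```

On a prime modulus the result agrees with the Legendre symbol, so the same routine can serve as the "convenient aid" for computing Legendre symbols.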


The Jacobi symbol has many uses. For one thing, it is a convenient aid for 
computing the Legendre symbol. We now use it to prove the following 
theorem. 


Theorem 3. Let a be a nonsquare integer. Then there are infinitely many 
primes p for which a is a quadratic nonresidue. 
Proof. It is easily seen that we may assume that a is square-free. Let a =
2^e q_1 q_2 ... q_n, where the q_i are distinct odd primes and e = 0 or 1. The case
a = 2 has to be dealt with separately. We shall assume to begin with that n ≥ 1,
i.e., that a is divisible by an odd prime.

Let l_1, l_2, ..., l_k be a finite set of odd primes not including any q_i. Let s be
any nonresidue mod q_n, and find a simultaneous solution to the congruences

\[
\begin{aligned}
x &\equiv 1 \ (l_i), & i &= 1, \ldots, k, \\
x &\equiv 1 \ (8), \\
x &\equiv 1 \ (q_i), & i &= 1, 2, \ldots, n-1, \\
x &\equiv s \ (q_n).
\end{aligned}
\]

Call the solution b. b is odd. Suppose that b = p_1 p_2 ... p_m is its prime
decomposition. Since b ≡ 1 (8) we have (2/b) = 1 and (q_i/b) = (b/q_i) by
Proposition 5.2.2. Thus (a/b) = (2/b)^e (q_1/b) ... (q_{n-1}/b)(q_n/b) = (b/q_1) ...
(b/q_{n-1})(b/q_n) = (1/q_1) ... (1/q_{n-1})(s/q_n) = -1.

On the other hand, by the definition of (a/b), we have (a/b) = (a/p_1)(a/p_2)
... (a/p_m). It follows that (a/p_i) = -1 for some i.

Notice that l_j does not divide b. Thus p_i ∉ {l_1, l_2, ..., l_k}.

To summarize, if a is a nonsquare, divisible by an odd prime, we have
found a prime p, outside a given finite set of primes {2, l_1, l_2, ..., l_k}, such
that (a/p) = -1. This proves Theorem 3 in this case.

It remains to consider the case a = 2. Let l_1, ..., l_k be a finite set of primes,
excluding 3, for which (2/l_i) = -1. Let b = 8 l_1 l_2 ... l_k + 3. b is not divisible
by 3 or any l_i. Since b ≡ 3 (8) we have (2/b) = (-1)^((b^2-1)/8) = -1. Suppose
that b = p_1 p_2 ... p_m is the prime decomposition of b. Then, as before, we see
that (2/p_i) = -1 for some i, and p_i ∉ {3, l_1, l_2, ..., l_k}. This proves Theorem 3
for a = 2. □
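
The case a = 2 of this proof is concrete enough to run. The sketch below starts from a hypothetical finite set [5, 11, 13] of primes with (2/l) = -1, forms b = 8 l_1 l_2 ... l_k + 3, and lists the prime factors of b for which 2 is a nonresidue; the helper names are illustrative.

```python
from math import prod

def prime_factors(n):
    """Distinct prime factors of n, by trial division (adequate for small n)."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def new_nonresidue_primes(known):
    """Prime factors p of b = 8 * prod(known) + 3 with p = 3 or 5 (8), i.e. (2/p) = -1."""
    b = 8 * prod(known) + 3
    return [p for p in prime_factors(b) if p % 8 in (3, 5)]

known = [5, 11, 13]              # assumed starting set; each has (2/l) = -1
found = new_nonresidue_primes(known)
print(found)
```

Here b = 5723 = 59 · 97, and 59 ≡ 3 (8) is a prime outside the starting set for which 2 is a nonresidue.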


§3 A Proof of the Law of Quadratic Reciprocity 

Gauss found eight separate proofs for the law of quadratic reciprocity. There 
are over a hundred now in existence. Of course, they are not all essentially 
different. Many just differ in small details from others. We shall present an 
ingenious proof due to Eisenstein. For a somewhat more elementary and 
standard proof, see [61]. 

A complex number ζ is called an nth root of unity if ζ^n = 1 for some integer
n > 0. If n is the least integer with this property, then ζ is called a primitive
nth root of unity.

The nth roots of unity are 1, e^(2πi/n), e^((2πi/n)2), ..., e^((2πi/n)(n-1)). Among these
the primitive nth roots of unity are e^((2πi/n)k), where (k, n) = 1.

If ζ is an nth root of unity and m ≡ l (n), then ζ^m = ζ^l. If ζ is a primitive
nth root of unity and ζ^m = ζ^l, then m ≡ l (n).

These elementary properties are easy to prove.

Consider the function f(z) = e^(2πiz) - e^(-2πiz) = 2i sin 2πz. This function
satisfies f(z + 1) = f(z) and f(-z) = -f(z). Also, its only real zeros are
the half integers. In other words, if r is a real number and 2r ∉ Z, then f(r) ≠ 0.

We wish to prove an important identity involving f(z), but first we need
an algebraic lemma.


Lemma. If n > 0 is odd, we have

\[ x^n - y^n = \prod_{k=0}^{n-1} (\zeta^k x - \zeta^{-k} y), \quad \text{where } \zeta = e^{2\pi i/n}. \]

Proof. 1, ζ, ζ^2, ..., ζ^(n-1) are all roots of the polynomial z^n - 1. Since there
are n of them and they are all distinct we have z^n - 1 = ∏_{k=0}^{n-1} (z - ζ^k). Let
z = x/y and multiply both sides by y^n. We get x^n - y^n = ∏_{k=0}^{n-1} (x - ζ^k y).
Since n is odd, as k runs over a complete system of residues mod n, so does
-2k. Thus

\[
\begin{aligned}
x^n - y^n &= \prod_{k=0}^{n-1} (x - \zeta^{-2k} y) \\
&= \zeta^{-(1+2+\cdots+(n-1))} \prod_{k=0}^{n-1} (\zeta^k x - \zeta^{-k} y) \\
&= \prod_{k=0}^{n-1} (\zeta^k x - \zeta^{-k} y).
\end{aligned}
\]

In the last step we have used the fact that 1 + 2 + 3 + ... + (n - 1) =
n((n - 1)/2) is divisible by n. □


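Proposition 5.3.1. If n is a positive odd integer and f(z) = e^(2πiz) - e^(-2πiz), then

\[ \frac{f(nz)}{f(z)} = \prod_{k=1}^{(n-1)/2} f\!\left(z + \frac{k}{n}\right) f\!\left(z - \frac{k}{n}\right). \]

Proof. In the lemma, substitute x = e^(2πiz) and y = e^(-2πiz). We see that

\[ f(nz) = \prod_{k=0}^{n-1} f\!\left(z + \frac{k}{n}\right). \]

Notice that f(z + k/n) = f(z + k/n - 1) = f(z - (n - k)/n). As k goes
from (n + 1)/2 to n - 1, n - k goes from (n - 1)/2 to 1. Thus

\[
\begin{aligned}
\frac{f(nz)}{f(z)} &= \prod_{k=1}^{(n-1)/2} f\!\left(z + \frac{k}{n}\right) \prod_{k=(n+1)/2}^{n-1} f\!\left(z + \frac{k}{n}\right) \\
&= \prod_{k=1}^{(n-1)/2} f\!\left(z + \frac{k}{n}\right) \prod_{k=(n+1)/2}^{n-1} f\!\left(z - \frac{n-k}{n}\right) \\
&= \prod_{k=1}^{(n-1)/2} f\!\left(z + \frac{k}{n}\right) f\!\left(z - \frac{k}{n}\right).
\end{aligned}
\]

□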
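
The product formula of Proposition 5.3.1 can be sanity-checked in floating point. The sketch below evaluates both sides at an arbitrarily chosen real point z = 0.1234 (an assumption of the example, picked so that f(z) ≠ 0); the function names are illustrative.

```python
import math

def f(z):
    """f(z) = e^(2 pi i z) - e^(-2 pi i z) = 2i sin(2 pi z), for real z."""
    return 2j * math.sin(2 * math.pi * z)

def quotient_side(n, z):
    """Left-hand side f(nz) / f(z)."""
    return f(n * z) / f(z)

def product_side(n, z):
    """Right-hand side: product of f(z + k/n) f(z - k/n) for k = 1, ..., (n-1)/2."""
    result = 1
    for k in range(1, (n - 1) // 2 + 1):
        result *= f(z + k / n) * f(z - k / n)
    return result

z = 0.1234
for n in (3, 5, 7, 9):
    print(n, abs(quotient_side(n, z) - product_side(n, z)))
```

The printed differences should be at the level of rounding error.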
Proposition 5.3.2. If p is an odd prime, a ∈ Z, and p ∤ a, then

\[ \prod_{l=1}^{(p-1)/2} f\!\left(\frac{la}{p}\right) = (a/p) \prod_{l=1}^{(p-1)/2} f\!\left(\frac{l}{p}\right). \]

Proof. As in the lemma of Section 1, la ≡ ±m_l (p), where 1 ≤ m_l ≤ (p - 1)/2.
Thus la/p and ±m_l/p differ by an integer. This implies that f(la/p) = f(±m_l/p)
= ±f(m_l/p).

The result now follows by taking the product of both sides as l goes from
1 to (p - 1)/2 and applying Gauss’ lemma. □
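
Proposition 5.3.2 says that the Legendre symbol is the ratio of two products of values of f. The following floating-point sketch recovers (a/p) that way and compares it with Euler's criterion; it is a numerical illustration only, and the helper names are not from the text.

```python
import math

def f(z):
    """f(z) = 2i sin(2 pi z) for real z."""
    return 2j * math.sin(2 * math.pi * z)

def legendre_via_f(a, p):
    """(a/p) as the ratio of the two products in Proposition 5.3.2 (needs p not dividing a)."""
    numerator, denominator = 1, 1
    for l in range(1, (p - 1) // 2 + 1):
        numerator *= f(l * a / p)
        denominator *= f(l / p)
    return round((numerator / denominator).real)

def legendre_euler(a, p):
    """(a/p) by Euler's criterion, for comparison."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

for p in [3, 5, 7, 11, 13]:
    print(p, [legendre_via_f(a, p) for a in range(1, p)])
```

Rounding is safe here because the exact ratio is ±1.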

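We are now in a position to prove the law of quadratic reciprocity. Let p
and q be odd primes. Then by Proposition 5.3.2

\[ \prod_{l=1}^{(p-1)/2} f\!\left(\frac{lq}{p}\right) = (q/p) \prod_{l=1}^{(p-1)/2} f\!\left(\frac{l}{p}\right). \]

By Proposition 5.3.1

\[ \frac{f(ql/p)}{f(l/p)} = \prod_{m=1}^{(q-1)/2} f\!\left(\frac{l}{p} + \frac{m}{q}\right) f\!\left(\frac{l}{p} - \frac{m}{q}\right). \]

Putting these two equations together we have

\[ (q/p) = \prod_{m=1}^{(q-1)/2} \prod_{l=1}^{(p-1)/2} f\!\left(\frac{l}{p} + \frac{m}{q}\right) f\!\left(\frac{l}{p} - \frac{m}{q}\right). \]

In the same way we find

\[ (p/q) = \prod_{m=1}^{(q-1)/2} \prod_{l=1}^{(p-1)/2} f\!\left(\frac{m}{q} + \frac{l}{p}\right) f\!\left(\frac{m}{q} - \frac{l}{p}\right). \]

Since f(m/q - l/p) = -f(l/p - m/q) we see that

\[ (-1)^{((p-1)/2)((q-1)/2)} (q/p) = (p/q), \]

and therefore that

\[ (p/q)(q/p) = (-1)^{((p-1)/2)((q-1)/2)}. \]

The proof is complete. □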

We conclude this chapter by giving an equivalent formulation of the law 
of quadratic reciprocity. 



Proposition 5.3.3. Let p and q be distinct odd primes and a ≥ 1 an integer.
Then the following assertions are equivalent:

(a) (p/q)(q/p) = (-1)^(((p-1)/2)((q-1)/2)).

(b) If p ≡ ±q (4a), p ∤ a, then (a/p) = (a/q).
Proof. In order to show (a) implies (b) it is enough, by multiplicativity, to
show that (b) holds when a is prime. For a = 2 the result follows from
Proposition 5.1.3. If a is an odd prime then by (a) (a/p) = (-1)^(((p-1)/2)((a-1)/2)) (p/a).
If p ≡ q (4a) then (p/a) = (q/a) so that

\[
\begin{aligned}
(a/p) &= (-1)^{((p-1)/2)((a-1)/2)} (q/a) \\
&= (-1)^{((p-1)/2)((a-1)/2)} (-1)^{((q-1)/2)((a-1)/2)} (a/q) \\
&= (-1)^{((a-1)/2)((p+q-2)/2)} (a/q).
\end{aligned}
\]

But p ≡ q (4a) implies p + q - 2 ≡ 0 (4) and the result follows. If, on the
other hand, p ≡ -q (4a), a similar calculation shows

\[ (a/p) = (-1)^{((a-1)/2)((p+q)/2)} (a/q). \]

Since p + q ≡ 0 (4) the result also holds in this case.

To show that (b) implies (a) suppose first of all that p > q and p ≡ q (4).
Then p = q + 4a, a ≥ 1. Thus

\[ (p/q) = \left(\frac{q+4a}{q}\right) = (a/q) = (a/p) = \left(\frac{p-q}{p}\right) = (-q/p) = (-1)^{(p-1)/2} (q/p). \]

If p ≡ 1 (4) then (p/q) = (q/p) which gives (a). If p ≡ 3 (4) then q ≡ 3 (4)
and we obtain (p/q) = -(q/p) which is part (a) in that case. Finally if p ≡
-q (4) then p + q = 4a and

\[ (p/q) = \left(\frac{-q+4a}{q}\right) = (a/q) = (a/p) = \left(\frac{p+q}{p}\right) = (q/p). \]

Thus (p/q) = (q/p) which is the assertion of part (a) since in this case at least
one of p or q must be congruent to 1 modulo 4. The proof is complete. □


Note that by part (b) of the above proposition we see that if (r, 4a) = 1
the quadratic character of a is the same for all primes in the arithmetic
progression r + 4at, t ∈ Z. In Chapter 16 we will see that infinitely many
such primes exist. Note also that the quadratic character of a modulo a prime
of the form r + 4at is the same as that modulo a prime of the form -r + 4at.
It was in this form that Euler first discovered this most remarkable law.
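
Euler's form of the law lends itself to a direct experiment: group the odd primes not dividing a by their class ±r mod 4a and check that (a/p) is constant on each group. The sketch below does this for primes below 400 (an arbitrary bound chosen for the illustration); all names are illustrative.

```python
def is_prime(n):
    """Primality by trial division, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def character_is_constant(a, bound=400):
    """True if (a/p) depends only on the class of p up to sign mod 4a, for odd primes p < bound."""
    seen = {}
    for p in range(3, bound, 2):
        if is_prime(p) and a % p != 0:
            r = p % (4 * a)
            key = min(r, 4 * a - r)          # identifies the classes r and -r mod 4a
            value = legendre(a, p)
            if seen.setdefault(key, value) != value:
                return False
    return True

print([character_is_constant(a) for a in (2, 3, 5, 6, 7, 10, 15)])
```

Every entry of the printed list should be True.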


Notes 

Kronecker has pointed out that the law of quadratic reciprocity follows
immediately from a conjecture of Euler contained in the paper “Theoremata
circa divisores numerorum in hac forma pa^2 ± qb^2 contentorum” (1744-1746).
It also appears explicitly in a later paper of Euler entitled “Observationes
circa divisionem quadratorum per numeros primos.” Using sufficient
conditions for the solvability of the equation ax^2 + by^2 + cz^2 = 0 (see
Proposition 17.3.2), Legendre (1785) was able to prove the result in special
cases. For example, the consideration of x^2 + py^2 - qz^2 where p ≡ 1 (4)
and q ≡ 3 (4) leads to the conclusion that if q is a square modulo p then p
is a square modulo q. The first complete proof of the theorem is due to Gauss
who recorded the date of the proof in his diary on April 8, 1796. During his
lifetime Gauss published six proofs of this remarkable law. The proof we
have given in this chapter is taken from Eisenstein’s paper “Applications de
l’Algèbre à l’Arithmétique transcendante.” Kummer, in an historical study
of the laws of reciprocity, refers to this proof as one of the most beautiful of
all the proofs (“... einen der schönsten Beweise dieses von den
ausgezeichnetsten Mathematikern viel bewiesenen Theorems ...”). Replacing the
trigonometric function by certain elliptic functions Eisenstein was able,
without much more difficulty, to prove the laws of cubic and biquadratic
reciprocity as well.

Throughout the nineteenth century various mathematicians including
Cauchy, Eisenstein, Dirichlet, Dedekind, and Kronecker gave new proofs
of the law of quadratic reciprocity. By 1921 there were, according to P.
Bachmann, 56 known proofs. Even in recent times new proofs continue to
appear. See, for example, the papers by M. Gerstenhaber [128] and R. Swan
[75]. On the other hand, the first proof of Gauss has been reconsidered
recently by E. Brown [99].

The Jacobi symbol is one generalization of the Legendre symbol. For an
interesting generalization in another direction, see the paper of P. Cartier
[14].

Quadratic reciprocity can be formulated in rings other than Z. Dirichlet
proved such a theorem for the ring of Gaussian integers Z[i]. D. Hilbert was
able to prove that quadratic reciprocity held for any algebraic number field,
a result that was an important stepping stone to class field theory. In another
direction it can be shown that reciprocity holds for the ring k[x], where k is a
finite field. See Artin [2] and Carlitz [10]. This result had already been stated
(though not proved) by Dedekind in 1857.

The generalization of Theorem 3 to higher powers was discovered first by
E. Trost in 1934.* Later it was stated as a conjecture by S. Chowla and
subsequently proven by N. C. Ankeny and C. A. Rogers.† They proved that if
x^n ≡ a (p) has a solution for all but a finite number of primes p, then either
a = b^n or 8 | n and a = 2^(n/2) b^n. When n is square-free and (a, n) = 1, the result
can be shown to follow from the Eisenstein reciprocity law as was done by
J. Kraft and M. Rosen [211]. Their proof will be given in Chapter 14. See
also H. Flanders [134] where the result is generalized to the case of algebraic
number fields and algebraic function fields of one variable over a finite field.

* Zur Theorie der Potenzreste. Nieuw Arch. Wiskunde, 18 (1934), 15-61.

† A conjecture of Chowla. Ann. Math., 53, No. 3 (1951), 541-550.


Exercises 

1. Use Gauss’ lemma to determine (5/7), (3/11), (6/13), and (-1/p).

2. Show that the number of solutions to x^2 ≡ a (p) is given by 1 + (a/p).

3. Suppose that p ∤ a. Show that the number of solutions to ax^2 + bx + c ≡ 0 (p) is
given by 1 + ((b^2 - 4ac)/p).

4. Prove that Σ_{a=1}^{p-1} (a/p) = 0.

5. Prove that Σ_{x=0}^{p-1} ((ax + b)/p) = 0 provided that p ∤ a.

6. Show that the number of solutions to x^2 - y^2 ≡ a (p) is given by

\[ \sum_{y=0}^{p-1} \bigl(1 + ((y^2 + a)/p)\bigr). \]

7. By calculating directly show that the number of solutions to x^2 - y^2 ≡ a (p) is
p - 1 if p ∤ a and 2p - 1 if p | a. (Hint: Use the change of variables u = x + y,
v = x - y.)

8. Combining the results of Exercises 6 and 7 show that

\[ \sum_{y=0}^{p-1} ((y^2 + a)/p) = \begin{cases} -1, & \text{if } p \nmid a, \\ p - 1, & \text{if } p \mid a. \end{cases} \]

9. Prove that 1^2 3^2 5^2 ... (p - 2)^2 ≡ (-1)^((p+1)/2) (p) by using Wilson’s theorem.

10. Let r_1, r_2, ..., r_((p-1)/2) be the quadratic residues between 1 and p. Show that their
product is congruent to 1 (p) if p ≡ 3 (4) and congruent to -1 (p) if p ≡ 1 (4).

11. Suppose that p ≡ 3 (4) and that q = 2p + 1 is also prime. Prove that 2^p - 1 is not
prime. (Hint: Use the quadratic character of 2 to show that q | 2^p - 1.) One must
assume that p > 3.

12. Let f(x) ∈ Z[x]. We say that a prime p divides f(x) if there is an integer n such that
p | f(n). Describe the prime divisors of x^2 + 1 and x^2 - 2.

13. Show that any prime divisor of x^4 - x^2 + 1 is congruent to 1 modulo 12.

14. Use the fact that U(Z/pZ) is cyclic to give a direct proof that (-3/p) = 1 when
p ≡ 1 (3). [Hint: There is a ρ in U(Z/pZ) of order 3. Show that (2ρ + 1)^2 = -3.]

15. If p ≡ 1 (5), show directly that (5/p) = 1 by the method of Exercise 14. [Hint: Let ρ
be an element of U(Z/pZ) of order 5. Show that (ρ + ρ^4)^2 + (ρ + ρ^4) - 1 = 0,
etc.]

16. Using quadratic reciprocity find the primes for which 7 is a quadratic residue. Do the 
same for 15. 

17. Supply the details to the proof of Proposition 5.2.1 and to the corollary to the lemma 
following it. 
18. Let D be a square-free integer that is also odd and positive. Show that there is an
integer b prime to D such that (b/D) = -1.

19. Let D be as in Exercise 18. Show that Σ (a/D) = 0, where the sum is over a reduced
residue system modulo D (see Exercise 6 of Chapter 3). Conclude that exactly one
half of the elements in U(Z/DZ) satisfy (a/D) = 1.

20. (continuation) Let a_1, a_2, ..., a_(φ(D)/2) be integers between 1 and D such that
(a_i, D) = 1 and (a_i/D) = 1. Prove that D is a quadratic residue modulo a prime p ∤ D,
p ≡ 1 (4) iff p ≡ a_i (D) for some i.

21. Apply the method of Exercises 19 and 20 to find those primes for which 21 is a
quadratic residue. [Answer: Those p ≡ 1, 4, 5, 16, 17, and 20 (21).]

22. Use the Jacobi symbol to determine (113/997), (215/761), (514/1093), and (401/757).

23. Suppose that p ≡ 1 (4). Show that there exist integers s and t such that pt = 1 + s^2.
Conclude that p is not a prime in Z[i]. Remember that Z[i] has unique factorization.

24. If p ≡ 1 (4), show that p is the sum of two squares; i.e., p = a^2 + b^2 with a, b ∈ Z.
(Hint: p = αβ with α and β being nonunits in Z[i]. Take the absolute value of both
sides and square the result.) This important result was discovered by Fermat.

25. An integer is called a biquadratic residue modulo p if it is congruent to a fourth
power. Using the identity x^4 + 4 = ((x + 1)^2 + 1)((x - 1)^2 + 1) show that -4 is a
biquadratic residue modulo p iff p ≡ 1 (4).

26. This exercise and Exercises 27 and 28 give Dirichlet’s beautiful proof that 2 is a
biquadratic residue modulo p iff p can be written in the form A^2 + 64B^2, where
A, B ∈ Z. Suppose that p ≡ 1 (4). Then p = a^2 + b^2 by Exercise 24. Take a to be odd.
Prove the following statements:

(a) (a/p) = 1.

(b) ((a + b)/p) = (-1)^(((a+b)^2-1)/8).

(c) (a + b)^2 ≡ 2ab (p).

(d) (a + b)^((p-1)/2) ≡ (2ab)^((p-1)/4) (p).

[Hint: 2p = (a + b)^2 + (a - b)^2.]

27. Suppose that f is such that b ≡ af (p). Show that f^2 ≡ -1 (p) and that 2^((p-1)/4) ≡
f^(ab/2) (p).

28. Show that x^4 ≡ 2 (p) has a solution for p ≡ 1 (4) iff p is of the form A^2 + 64B^2.

29. Let (RR) be the number of pairs (n, n + 1) in the set 1, 2, 3, ..., p - 1 such that n and
n + 1 are both quadratic residues modulo p. Let (NR) be the number of pairs
(n, n + 1) in the set 1, 2, 3, ..., p - 1 such that n is a quadratic nonresidue and n + 1
is a quadratic residue. Similarly, define (RN) and (NN). Determine the sums
(RR) + (RN), (NR) + (NN), (RR) + (NR), and (RN) + (NN).

30. Show that (RR) + (NN) - (RN) - (NR) = Σ_{n=1}^{p-1} (n(n + 1)/p). Evaluate this sum
and show that it is equal to -1. (Hint: The result of Exercise 8 is useful.)

31. Use the results of Exercises 29 and 30 to show that (RR) = (1/4)(p - 4 - ε), where
ε = (-1)^((p-1)/2).

32. If p is an odd prime show that (2/p) = ∏_{j=1}^{(p-1)/2} 2 cos(2πj/p). Use this result to give
another proof to Proposition 5.1.3.

33. Use Proposition 5.3.2 to derive the quadratic character of -1.

34. If p is an odd prime distinct from 3 show that (3/p) = ∏_{j=1}^{(p-1)/2} (3 - 4 sin^2(2πj/p)).

35. Use the preceding exercise to show that 3 is a square modulo p iff p is congruent to 1
or -1 modulo 12.

36. Show that part (c) of Proposition 5.2.2 is true if a is negative and b is positive (both 
still odd). 

37. Show that if a is negative then p ≡ q (4a), p ∤ a implies (a/p) = (a/q).

38. Let p be an odd prime. Derive the quadratic character of 2 modulo p by verifying the 
following steps, involving the Jacobi symbol : 





Generalize the argument to show that 



a > 0, p^a. 



Chapter 6 

Quadratic Gauss Sums 


The method by which we proved the quadratic reciprocity 
in Chapter 5 is ingenious but is not easy to use in more 
general situations. We shall give a new proof in this chapter 
that is based on methods that can be used to prove higher 
reciprocity laws. In particular, we shall introduce the
notion of a Gauss sum, which will play an important role
in the latter part of this book. 

Section 1 introduces algebraic numbers and algebraic 
integers. The proofs are somewhat technical. The reader 
may wish to simply skim this section on a first reading. 


§1 Algebraic Numbers and Algebraic Integers 

Definition. An algebraic number is a complex number α that is a root of a
polynomial a_0 x^n + a_1 x^(n-1) + a_2 x^(n-2) + ... + a_n = 0, where a_0, a_1, a_2, ...,
a_n ∈ Q, and a_0 ≠ 0.

An algebraic integer ω is a complex number that is a root of a polynomial
x^n + b_1 x^(n-1) + ... + b_n = 0, where b_1, b_2, ..., b_n ∈ Z.

Clearly every algebraic integer is an algebraic number. The converse is 
false, as we shall see. 

Proposition 6.1.1. A rational number r ∈ Q is an algebraic integer iff r ∈ Z.

Proof. If r ∈ Z, then r is a root of x - r = 0. Thus r is an algebraic integer.

Suppose that r ∈ Q and that r is an algebraic integer; i.e., r satisfies an
equation x^n + b_1 x^(n-1) + ... + b_n = 0 with b_1, ..., b_n ∈ Z. r = c/d, where
c, d ∈ Z and we may assume that c and d are relatively prime. Substituting c/d
into the equation and multiplying both sides by d^n yields

\[ c^n + b_1 c^{n-1} d + \cdots + b_n d^n = 0. \]

It follows that d divides c^n and, since (d, c) = 1, that d | c. Again, since
(d, c) = 1 it follows that d = ±1, and so r = c/d is in Z.

It follows, for example, that f is not an algebraic integer. □ 

The main results of this section are that the set of algebraic numbers forms 
a field and that the set of algebraic integers forms a ring. We need some 
preliminary work. 
Definition. A subset V ⊂ C of the complex numbers is called a Q module if

(a) γ_1, γ_2 ∈ V implies that γ_1 ± γ_2 ∈ V.

(b) γ ∈ V and r ∈ Q implies that rγ ∈ V.

(c) There exist elements γ_1, γ_2, ..., γ_l ∈ V such that every γ ∈ V has the form
Σ_{i=1}^{l} r_i γ_i with r_i ∈ Q.

More briefly, V ⊂ C is a Q module if it is a finite dimensional vector
space over Q.

If γ_1, γ_2, ..., γ_l ∈ C, the set of all expressions Σ_{i=1}^{l} r_i γ_i, r_1, r_2, ..., r_l ∈ Q,
is easily seen to be a Q module. We denote this Q module by [γ_1, γ_2, ..., γ_l].

Proposition 6.1.2. Let V = [γ_1, γ_2, ..., γ_l], and suppose that α ∈ C has the
property that αγ ∈ V for all γ ∈ V. Then α is an algebraic number.

Proof. αγ_i ∈ V for i = 1, 2, ..., l. Thus αγ_i = Σ_{j=1}^{l} a_ij γ_j, where a_ij ∈ Q. It
follows that 0 = Σ_{j=1}^{l} (a_ij - δ_ij α) γ_j, where δ_ij = 0 if i ≠ j and δ_ij = 1 if
i = j. By standard linear algebra we have that det(a_ij - δ_ij α) = 0. Writing
out the determinant we see that α satisfies a polynomial of degree l with
rational coefficients. Thus α is an algebraic number. □

Proposition 6.1.3. The set of algebraic numbers forms a field.

Proof. Suppose that α_1 and α_2 are algebraic numbers. We shall show that
α_1 α_2 and α_1 + α_2 are algebraic numbers.

Suppose that α_1^n + r_1 α_1^(n-1) + r_2 α_1^(n-2) + ... + r_n = 0 and that α_2^m +
s_1 α_2^(m-1) + s_2 α_2^(m-2) + ... + s_m = 0, where r_i, s_j ∈ Q. Let V be the Q module
obtained by forming all Q linear combinations of the elements α_1^i α_2^j, where
0 ≤ i < n and 0 ≤ j < m. For γ ∈ V we have α_1 γ ∈ V and α_2 γ ∈ V (prove it).
Thus we also have (α_1 + α_2)γ ∈ V and (α_1 α_2)γ ∈ V. By Proposition 6.1.2 it
follows that both α_1 + α_2 and α_1 α_2 are algebraic numbers.

Finally, if α is an algebraic number, not zero, we must show that α^(-1) is
an algebraic number. Suppose that a_0 α^n + a_1 α^(n-1) + ... + a_n = 0, where
the a_i ∈ Q. Then a_n α^(-n) + a_{n-1} α^(-(n-1)) + ... + a_0 = 0. The result follows.

□

To prove that the set of algebraic integers forms a ring it is necessary only to
alter the above proofs slightly.

Definition. A subset $W \subset \mathbb{C}$ is called a $\mathbb{Z}$ module if

(a) $\gamma_1, \gamma_2 \in W$ implies that $\gamma_1 \pm \gamma_2 \in W$.

(b) There exist elements $\gamma_1, \gamma_2, \ldots, \gamma_l \in W$ such that every $\gamma \in W$ is of the form
$\sum_{i=1}^{l} b_i \gamma_i$ with $b_i \in \mathbb{Z}$.

Proposition 6.1.4. Let $W$ be a $\mathbb{Z}$ module and suppose that $\omega \in \mathbb{C}$ is such that
$\omega\gamma \in W$ for all $\gamma \in W$. Then $\omega$ is an algebraic integer.

Proof. The proof proceeds exactly as in Proposition 6.1.2, except that now
the $a_{ij} \in \mathbb{Z}$. The equation $\det(a_{ij} - \delta_{ij}\omega) = 0$ when written out shows that
$\omega$ satisfies a monic equation of degree $l$ with integer coefficients. Thus $\omega$ is an
algebraic integer. □

Proposition 6.1.5. The set of algebraic integers forms a ring. 

Proof. The proof follows from Proposition 6.1.4 in exactly the same way in 
which Proposition 6.1.3 follows from Proposition 6.1.2. We leave the details 
to the reader. □ 

Let $\Omega$ denote the ring of algebraic integers. If $\omega_1, \omega_2, \gamma \in \Omega$, we say that
$\omega_1 \equiv \omega_2\ (\gamma)$ ($\omega_1$ is congruent to $\omega_2$ modulo $\gamma$) if $\omega_1 - \omega_2 = \gamma\alpha$ with $\alpha \in \Omega$.
This notion of congruence satisfies all the formal properties of congruence
in $\mathbb{Z}$.

If $a, b, c \in \mathbb{Z}$, $c \ne 0$, then $a \equiv b\ (c)$ is ambiguous since it denotes congruence
in $\mathbb{Z}$ and in $\Omega$. The ambiguity is only apparent, however. If $a - b = \alpha c$ with
$\alpha \in \Omega$, then $\alpha$ is both a rational number and an algebraic integer. Thus $\alpha$ is an
ordinary integer by Proposition 6.1.1.

The following proposition will be useful. 

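Proposition 6.1.6. If $\omega_1, \omega_2 \in \Omega$ and $p \in \mathbb{Z}$ is a prime, then

$$(\omega_1 + \omega_2)^p \equiv \omega_1^p + \omega_2^p\ (p).$$

Proof. $(\omega_1 + \omega_2)^p = \sum_{k=0}^{p} \binom{p}{k}\omega_1^k\omega_2^{p-k}$. By Lemma 2, Chapter 4, we have
$p \mid \binom{p}{k}$ for $1 \le k \le p - 1$. The result follows from this and the fact that $\Omega$
is a ring. □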
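The divisibility of the middle binomial coefficients is easy to check by machine. The following sketch tests it directly and then tests the congruence of the proposition in the special case where $\omega_1, \omega_2$ are ordinary integers; the function names are hypothetical and the integer restriction is an assumption made so the check stays elementary.

```python
from math import comb

def middle_binomials_divisible(p):
    """True if p divides C(p, k) for every 1 <= k <= p - 1."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

def power_of_sum_congruence_holds(p, a, b):
    """Check (a + b)^p = a^p + b^p modulo p for ordinary integers a, b."""
    return ((a + b) ** p - a ** p - b ** p) % p == 0
```

For a composite modulus such as 4 the first check fails, since $\binom{4}{2} = 6$ is not divisible by 4; primality of $p$ is essential.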

A root of unity is a solution to an equation of the form $x^n - 1 = 0$. Thus
roots of unity are algebraic integers, and so are $\mathbb{Z}$ linear combinations of roots
of unity.

We conclude this section by presenting several important properties of
algebraic numbers. If $\alpha$ is an algebraic number then clearly any nonzero
polynomial $f(x)$ in $\mathbb{Q}[x]$ of smallest degree for which $f(\alpha) = 0$ must be
irreducible.

Proposition 6.1.7. If $\alpha$ is an algebraic number then $\alpha$ is the root of a unique
monic irreducible $f(x)$ in $\mathbb{Q}[x]$. Furthermore if $g(x) \in \mathbb{Q}[x]$, $g(\alpha) = 0$ then
$f(x) \mid g(x)$.

Proof. Let $f(x)$ be any monic irreducible with $f(\alpha) = 0$. We prove the
second assertion first. If $f(x) \nmid g(x)$ then $(f(x), g(x)) = 1$. By Lemma 4,
Section 2, Chapter 1 we may write $f(x)h(x) + g(x)t(x) = 1$ for polynomials
$h(x), t(x) \in \mathbb{Q}[x]$. Putting $x = \alpha$ gives a contradiction. Uniqueness now
follows immediately. □

The polynomial defined in Proposition 6.1.7 depends therefore only
upon $\alpha$. It is called the minimal polynomial of $\alpha$. If the degree of the minimal
polynomial is $n$, then $\alpha$ is called an algebraic number of degree $n$. If $f(x)$ is
irreducible of degree $n$, then, using the fundamental theorem of algebra and
Exercise 16 we see that $f(x)$ is the minimal polynomial for each of its $n$ roots.
If $\alpha, \beta$ are roots of $f(x)$ then $\alpha$ and $\beta$ are said to be conjugate.

The set of complex numbers $g(\alpha)/h(\alpha)$ where $g(x), h(x) \in \mathbb{Q}[x]$, $h(\alpha) \ne 0$
forms a field denoted by $\mathbb{Q}(\alpha)$. Denote by $\mathbb{Q}[\alpha]$ the ring of polynomials in $\alpha$
with rational coefficients. Then one has the following important result.

Proposition 6.1.8. If $\alpha \in \Omega$ then $\mathbb{Q}(\alpha) = \mathbb{Q}[\alpha]$.

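Proof. Clearly $\mathbb{Q}[\alpha] \subset \mathbb{Q}(\alpha)$. If $h(\alpha) \in \mathbb{Q}[\alpha]$, $h(\alpha) \ne 0$, then by Proposition
6.1.7, $f(x) \nmid h(x)$, where $f(x)$ is the minimal polynomial of $\alpha$. Thus $(f(x),
h(x)) = 1$ so that by Lemma 4, Section 2, Chapter 1, $s(x)f(x) + t(x)h(x) = 1$
for elements $s(x), t(x) \in \mathbb{Q}[x]$. Put $x = \alpha$ so that $t(\alpha)h(\alpha) = 1$. Thus $h(\alpha)^{-1} \in
\mathbb{Q}[\alpha]$. If $\beta \in \mathbb{Q}(\alpha)$ then $\beta = g(\alpha)h(\alpha)^{-1}$ for $g(x), h(x) \in \mathbb{Q}[x]$ and the above
shows that $\beta \in \mathbb{Q}[\alpha]$. □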
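The proof is constructive: the polynomial $t(x)$ produced by the Euclidean algorithm gives $h(\alpha)^{-1}$ explicitly. As a rough sketch, assuming the example $f(x) = x^3 - 2$ and $h(x) = 1 + x$ (neither is from the text), the code below runs the extended Euclidean algorithm over $\mathbb{Q}$ with polynomials stored as coefficient lists, lowest degree first. All function names are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients in place and return the list."""
    while p and p[-1] == 0:
        p.pop()
    return p

def sub(a, b):
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])

def mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(a, b):
    """Quotient and remainder of a by the nonzero polynomial b."""
    a = list(a)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while a and len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] / b[-1]
        q[shift] = c
        for i, y in enumerate(b):
            a[i + shift] -= c * y
        trim(a)
    return trim(q), a

def inverse_mod(h, f):
    """Return t(x) with t(x) h(x) = 1 modulo f(x), assuming (f, h) = 1."""
    f = [Fraction(c) for c in f]
    r0, r1 = list(f), [Fraction(c) for c in h]
    t0, t1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, sub(t0, mul(q, t1))
    if len(r0) != 1:
        raise ValueError("f and h are not relatively prime")
    t = [c / r0[0] for c in t0]       # normalize so that s f + t h = 1
    return divmod_poly(t, f)[1]       # reduce t modulo f

f = [-2, 0, 0, 1]                     # x^3 - 2, assumed minimal polynomial of alpha
h = [1, 1]                            # h(alpha) = 1 + alpha
inv = inverse_mod(h, f)
check = divmod_poly(mul([Fraction(c) for c in h], inv), [Fraction(c) for c in f])[1]
```

With these inputs `inv` represents $(1 - \alpha + \alpha^2)/3$, consistent with $(1 + \alpha)(1 - \alpha + \alpha^2) = 1 + \alpha^3 = 3$.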

Corollary. If $\alpha$ is an algebraic number of degree $n$ then $[\mathbb{Q}(\alpha) : \mathbb{Q}] = n$.

Proof. By the proposition it is enough to show $[\mathbb{Q}[\alpha] : \mathbb{Q}] = n$. Since
$f(\alpha) = 0$ it is easily seen that $1, \alpha, \ldots, \alpha^{n-1}$ span $\mathbb{Q}[\alpha]$. If on the other hand
$a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1} = 0$, $a_i \in \mathbb{Q}$, then $g(\alpha) = 0$ for $g(x) = a_0 +
a_1x + \cdots + a_{n-1}x^{n-1}$. Then, by Proposition 6.1.7, $f(x) \mid g(x)$. But $\deg(g(x)) <
\deg(f(x))$ which implies that $a_0 = a_1 = a_2 = \cdots = a_{n-1} = 0$. Therefore
$1, \alpha, \ldots, \alpha^{n-1}$ are linearly independent over $\mathbb{Q}$. □


§2 The Quadratic Character of 2 

Let $\zeta = e^{2\pi i/8}$. Then $\zeta$ is a primitive eighth root of unity. Thus $0 = \zeta^8 - 1 =
(\zeta^4 - 1)(\zeta^4 + 1)$. Since $\zeta^4 \ne 1$ we have $\zeta^4 = -1$. Multiplying by $\zeta^{-2}$ and
then adding $\zeta^{-2}$ to both sides yields $\zeta^2 + \zeta^{-2} = 0$. This equation is also
easily derived from the observation that $\zeta^2 = e^{i(\pi/2)} = i$.

The quadratic character of 2 will now be derived from the relation

$$(\zeta + \zeta^{-1})^2 = \zeta^2 + 2 + \zeta^{-2} = 2.$$

Let $\tau = \zeta + \zeta^{-1}$ and notice that $\zeta$ and $\tau$ are algebraic integers. We may
thus work with congruences in the ring of algebraic integers.

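Let $p$ be an odd prime in $\mathbb{Z}$ and notice that

$$\tau^{p-1} = (\tau^2)^{(p-1)/2} = 2^{(p-1)/2} \equiv (2/p)\ (p).$$

It follows that $\tau^p \equiv (2/p)\tau\ (p)$. By Proposition 6.1.6, $\tau^p = (\zeta + \zeta^{-1})^p \equiv
\zeta^p + \zeta^{-p}\ (p)$.

Remembering that $\zeta^8 = 1$ we have $\zeta^p + \zeta^{-p} = \zeta + \zeta^{-1}$ for $p \equiv \pm 1\ (8)$
and $\zeta^p + \zeta^{-p} = \zeta^3 + \zeta^{-3}$ for $p \equiv \pm 3\ (8)$. The result in the latter case may
be simplified by observing that $\zeta^4 = -1$ implies that $\zeta^3 = -\zeta^{-1}$. Thus
$\zeta^p + \zeta^{-p} = -(\zeta + \zeta^{-1})$ if $p \equiv \pm 3\ (8)$. Summarizing,

$$\zeta^p + \zeta^{-p} = \begin{cases} \tau, & \text{if } p \equiv \pm 1\ (8), \\ -\tau, & \text{if } p \equiv \pm 3\ (8). \end{cases}$$

Substituting this result into the relation $\tau^p \equiv (2/p)\tau\ (p)$ yields

$$(-1)^{\varepsilon}\tau \equiv (2/p)\tau\ (p), \quad \text{where } \varepsilon \equiv \frac{p^2 - 1}{8}\ (2).$$

Multiply both sides of the congruence by $\tau$. Then

$$(-1)^{\varepsilon}\, 2 \equiv (2/p)\, 2\ (p),$$

implying that

$$(-1)^{\varepsilon} \equiv (2/p)\ (p).$$

This last congruence implies that $(2/p) = (-1)^{\varepsilon}$, which is the desired
result.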
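The formula $(2/p) = (-1)^{(p^2-1)/8}$ can be compared against a direct computation. In the sketch below the Legendre symbol is computed by Euler's criterion; the helper names `is_prime`, `legendre`, and `two_by_formula` are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def two_by_formula(p):
    """(-1)^((p^2 - 1)/8), the value of (2/p) derived in this section."""
    return (-1) ** ((p * p - 1) // 8)

odd_primes = [p for p in range(3, 200) if is_prime(p)]
agree = all(legendre(2, p) == two_by_formula(p) for p in odd_primes)
```

For instance $p = 7 \equiv -1\ (8)$ gives $+1$ (indeed $3^2 \equiv 2\ (7)$), while $p = 5 \equiv -3\ (8)$ gives $-1$.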

Euler (1707-1783), in an early paper, proved that 2 is a quadratic residue
modulo primes $p \equiv 1\ (8)$. His method contains the key idea of the above
proof.

Euler assumed that $U(\mathbb{Z}/p\mathbb{Z})$ is a cyclic group. Gauss was the first to give a
rigorous proof of this fact (see Theorem 1, Chapter 4). Let $\lambda$ be a generator of
$U(\mathbb{Z}/p\mathbb{Z})$ and set $\gamma = \lambda^{(p-1)/8}$. Then $\gamma$ has order 8, so that $\gamma^4 = -1$ and $\gamma^2 +
\gamma^{-2} = 0$. Therefore, $(\gamma + \gamma^{-1})^2 = \gamma^2 + 2 + \gamma^{-2} = 2$. This shows that 2 is a
square in $U(\mathbb{Z}/p\mathbb{Z})$, which is equivalent to 2 being a quadratic residue
modulo $p$.
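Euler's construction is explicit enough to run. The sketch below, which assumes $p$ is a prime with $p \equiv 1\ (8)$, finds a generator $\lambda$ by brute force, forms $\gamma = \lambda^{(p-1)/8}$, and returns $\gamma + \gamma^{-1}$ as a square root of 2 modulo $p$. The function names are hypothetical, and `pow(x, -1, p)` for the modular inverse needs Python 3.8 or later.

```python
def primitive_root(p):
    """Smallest generator of U(Z/pZ) for a prime p, found by brute force."""
    for g in range(2, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g
    raise ValueError("no generator found; is p an odd prime?")

def sqrt_of_two(p):
    """A square root of 2 modulo a prime p = 1 (8), following Euler's idea."""
    if p % 8 != 1:
        raise ValueError("the construction needs p = 1 (8)")
    lam = primitive_root(p)
    gamma = pow(lam, (p - 1) // 8, p)      # element of order 8
    return (gamma + pow(gamma, -1, p)) % p
```

For $p = 17$ the generator found is 3, so $\gamma = 9$, $\gamma^{-1} = 2$, and $11^2 = 121 \equiv 2\ (17)$.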

If $p \not\equiv 1\ (8)$, this proof cannot get started. However, the theory of finite
fields enables us to carry through to a complete proof of quadratic reciprocity
using Euler’s idea. We shall develop the theory of finite fields in Chapter 7.


§3 Quadratic Gauss Sums 

Given the relation $(\zeta + \zeta^{-1})^2 = 2$ of Section 2, one might ask if there is a
similar relation when 2 is replaced by an odd prime $p$. The answer is yes, and,
moreover, the full law of quadratic reciprocity follows from this new relation
by using the method of Section 2.

Throughout this section $\zeta$ will denote $e^{2\pi i/p}$, a primitive $p$th root of unity.

Lemma 1. $\sum_{t=0}^{p-1} \zeta^{at}$ is equal to $p$ if $a \equiv 0\ (p)$. Otherwise it is zero.

Proof. If $a \equiv 0\ (p)$, then $\zeta^a = 1$, and so $\sum_{t=0}^{p-1} \zeta^{at} = p$. If $a \not\equiv 0\ (p)$, then
$\zeta^a \ne 1$ and $\sum_{t=0}^{p-1} \zeta^{at} = (\zeta^{ap} - 1)/(\zeta^a - 1) = 0$. □

Corollary. $p^{-1}\sum_{t=0}^{p-1} \zeta^{t(x-y)} = \delta(x, y)$, where $\delta(x, y) = 1$ if $x \equiv y\ (p)$ and
$\delta(x, y) = 0$ if $x \not\equiv y\ (p)$.

Proof. The proof is immediate from Lemma 1. □

All summations for the remainder of this section will be over the range zero 
to p — 1. It will simplify notation to avoid writing out this fact each time. 

Lemma 2. $\sum_t (t/p) = 0$, where $(t/p)$ is the Legendre symbol.

Proof. By definition $(0/p) = 0$. Of the remaining $p - 1$ terms in the summation,
half are $+1$ and half are $-1$, since by Corollary 1 to Proposition
5.1.2, there are as many quadratic residues as quadratic nonresidues mod $p$. □

We are now in a position to introduce the notion of Gauss sum.

Definition. $g_a = \sum_t (t/p)\zeta^{at}$ is called a quadratic Gauss sum.

Proposition 6.3.1. $g_a = (a/p)g_1$.

Proof. If $a \equiv 0\ (p)$, then $\zeta^{at} = 1$ for all $t$, and $g_a = \sum_t (t/p) = 0$ by Lemma 2.
This gives the result in the case that $a \equiv 0\ (p)$.

Now suppose that $a \not\equiv 0\ (p)$. Then

$$(a/p)g_a = \sum_t (at/p)\zeta^{at} = \sum_x (x/p)\zeta^{x} = g_1.$$

We have used the fact that $at$ runs over a complete residue system mod $p$
when $t$ does and that $(x/p)$ and $\zeta^x$ depend only on the residue class of $x$
modulo $p$.

Since $(a/p)^2 = 1$ when $a \not\equiv 0\ (p)$ our result follows by multiplying the
equation $(a/p)g_a = g_1$ on both sides by $(a/p)$. □

From now on we shall denote $g_1$ by $g$. It follows from Proposition 6.3.1
that $g_a^2 = g^2$ if $a \not\equiv 0\ (p)$. We shall now deduce this common value.

Proposition 6.3.2. $g^2 = (-1)^{(p-1)/2}p$.

Proof. The idea of the proof is to evaluate the sum $\sum_a g_a g_{-a}$ in two ways.

If $a \not\equiv 0\ (p)$, then $g_a g_{-a} = (a/p)(-a/p)g^2 = (-1/p)g^2$. It follows that

$$\sum_a g_a g_{-a} = (-1/p)(p - 1)g^2.$$

Now, notice that

$$g_a g_{-a} = \sum_x \sum_y (x/p)(y/p)\zeta^{a(x-y)}.$$

Summing both sides over $a$ and using the corollary to Lemma 1 yields

$$\sum_a g_a g_{-a} = \sum_x \sum_y (x/p)(y/p)\,\delta(x, y)\,p = (p - 1)p.$$

Putting these results together we obtain $(-1/p)(p - 1)g^2 = (p - 1)p$. Therefore,
$g^2 = (-1/p)p$. □
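Propositions 6.3.1 and 6.3.2 can be checked numerically for small primes. The sketch below evaluates $g_a$ in floating-point complex arithmetic, so the comparisons are only up to rounding error; `legendre` and `gauss_sum` are hypothetical helper names.

```python
import cmath

def legendre(t, p):
    """Legendre symbol (t/p) for an odd prime p, via Euler's criterion."""
    t %= p
    if t == 0:
        return 0
    return 1 if pow(t, (p - 1) // 2, p) == 1 else -1

def gauss_sum(a, p):
    """g_a = sum over t of (t/p) zeta^(a t), with zeta = exp(2 pi i / p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(t, p) * zeta ** (a * t) for t in range(p))

def square_matches(p):
    """Compare g^2 with (-1)^((p-1)/2) p up to rounding error."""
    g = gauss_sum(1, p)
    return abs(g * g - (-1) ** ((p - 1) // 2) * p) < 1e-9
```

For $p = 3$ the sum is $\zeta - \zeta^2 = i\sqrt{3}$, whose square is $-3$.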

Let $p^* = (-1)^{(p-1)/2}p$. The equation $g^2 = p^*$ is the desired analog of the
equation $\tau^2 = 2$. Let $q \ne p$ be another odd prime. We proceed to prove the
law of quadratic reciprocity by working with congruences mod $q$ in the ring
of algebraic integers:

$$g^{q-1} = (g^2)^{(q-1)/2} = p^{*(q-1)/2} \equiv (p^*/q)\ (q).$$

Thus

$$g^q \equiv (p^*/q)\,g\ (q).$$

Using Proposition 6.1.6 we see that

$$g^q = \Bigl(\sum_t (t/p)\zeta^t\Bigr)^q \equiv \sum_t (t/p)^q \zeta^{qt} \equiv g_q\ (q).$$

It follows that $g^q \equiv g_q \equiv (q/p)g\ (q)$ and so

$$(q/p)\,g \equiv (p^*/q)\,g\ (q).$$

Multiply both sides by $g$, and use $g^2 = p^*$:

$$(q/p)\,p^* \equiv (p^*/q)\,p^*\ (q),$$

which implies that

$$(q/p) \equiv (p^*/q)\ (q)$$

and finally

$$(q/p) = (p^*/q).$$

To see that this result is what we want simply notice that

$$(p^*/q) = (-1/q)^{(p-1)/2}(p/q) = (-1)^{((q-1)/2)((p-1)/2)}(p/q).$$

The notion of quadratic Gauss sum that we have used can be considerably
generalized. We shall present some of these generalizations after developing
the theory of finite fields. Cubic Gauss sums will be used to prove the law
of cubic reciprocity, and quartic Gauss sums will be used to prove biquadratic
reciprocity.
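The reciprocity law obtained in this section, in both the form $(q/p) = (p^*/q)$ and the classical form $(p/q)(q/p) = (-1)^{((p-1)/2)((q-1)/2)}$, lends itself to a brute-force check on small primes. The following is a minimal sketch with hypothetical helper names; the Legendre symbol is computed by Euler's criterion.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p; a may be negative."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def odd_primes_below(limit):
    return [n for n in range(3, limit)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]

def p_star_form_holds(p, q):
    """Check (q/p) = (p*/q) with p* = (-1)^((p-1)/2) p."""
    p_star = (-1) ** ((p - 1) // 2) * p
    return legendre(q, p) == legendre(p_star, q)

def classical_form_holds(p, q):
    """Check (p/q)(q/p) = (-1)^(((p-1)/2)((q-1)/2))."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign

primes = odd_primes_below(60)
pairs = [(p, q) for p in primes for q in primes if p != q]
```

Every pair in `pairs` should satisfy both predicates; a failure would indicate a bug in the helpers rather than in the theorem.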

§4 The Sign of the Quadratic Gauss Sum* 


According to Proposition 6.3.2, the quadratic Gauss sum has value $\pm\sqrt{p}$ if
$p \equiv 1\ (4)$ and $\pm i\sqrt{p}$ if $p \equiv 3\ (4)$. Thus the value of $g(\chi)$ is determined up to
sign. The determination of the sign is a much more difficult problem. The
conjecture that the plus sign holds in each case was made by Gauss and recorded
in his diary in May 1801. It was not until four years later that he found
a proof. On August 30, 1805 Gauss recorded in his diary that a proof of the
“very elegant theorem mentioned in 1801” had finally been achieved. He
wrote to his friend W. Olbers on September 3, 1805 that seldom had a week
passed for four years that he had not tried in vain to prove his conjecture.
Finally according to Gauss “Wie der Blitz einschlägt, hat sich das Räthsel
gelöst...” (as lightning strikes was the puzzle solved).

Subsequently proofs were found by Dirichlet, Cauchy, Kronecker, 
Mertens, Schur, and others. In this section we present one of Kronecker’s 
proofs. 

As in the previous section $\zeta = e^{2\pi i/p}$. Then $1, \zeta, \ldots, \zeta^{p-1}$ are the roots of
$x^p - 1$.

Proposition 6.4.1. The polynomial $1 + x + \cdots + x^{p-1}$ is irreducible in
$\mathbb{Q}[x]$.

Proof. By Exercise 4 at the end of this chapter (“Gauss’ lemma”) it is enough
to show that $1 + x + \cdots + x^{p-1}$ has no nontrivial factorization in $\mathbb{Z}[x]$.
Suppose, on the contrary, that $1 + x + x^2 + \cdots + x^{p-1} = f(x)g(x)$ where
$f(x), g(x) \in \mathbb{Z}[x]$ and each has degree at least one. Putting $x = 1$ gives
$p = f(1)g(1)$. Therefore we may assume $g(1) = \pm 1$. Using a bar to denote
reduction modulo $p$ we conclude that $\bar{g}(\bar{1}) \ne \bar{0}$. On the other hand since
$p \mid \binom{p}{j}$, $j = 1, \ldots, p - 1$, we have $x^p - 1 \equiv (x - 1)^p\ (p)$ and division of both
sides by $x - 1$ shows that $1 + x + \cdots + x^{p-1} \equiv (x - 1)^{p-1}\ (p)$. By
Theorem 2, Chapter 1 and Proposition 3.3.2 it follows that $g(x) \equiv (x - 1)^s\ (p)$
for some positive integer $s$. However, this contradicts the fact that $\bar{g}(\bar{1}) \ne \bar{0}$,
and the proof is complete. □


* In this section the Gauss sum $g$ will be denoted by $g(\chi)$ with $\chi(t) = (t/p)$ by definition.

Combining the above proposition with Proposition 6.1.7 we see that if
$g(\zeta) = 0$ for $g(x) \in \mathbb{Q}[x]$ then $1 + x + \cdots + x^{p-1} \mid g(x)$. This observation
will be useful later.


Proposition 6.4.2. $\prod_{k=1}^{(p-1)/2} (\zeta^{2k-1} - \zeta^{-(2k-1)})^2 = (-1)^{(p-1)/2}p$.

Proof. One has $x^p - 1 = (x - 1)\prod_{j=1}^{p-1} (x - \zeta^j)$. Divide by $x - 1$ and put
$x = 1$ to obtain $p = \prod_r (1 - \zeta^r)$, where the product is over any complete set
of representatives of the nonzero cosets modulo $p$. The integers $\pm(4k - 2)$,
$k = 1, 2, \ldots, (p - 1)/2$ are easily seen to be such a system of residues. Thus

$$\begin{aligned}
p &= \prod (1 - \zeta^{4k-2}) \prod (1 - \zeta^{-(4k-2)}) \\
  &= \prod (\zeta^{-(2k-1)} - \zeta^{2k-1}) \prod (\zeta^{2k-1} - \zeta^{-(2k-1)}) \\
  &= (-1)^{(p-1)/2} \prod (\zeta^{2k-1} - \zeta^{-(2k-1)})^2,
\end{aligned}$$

all the products being over $k = 1, 2, \ldots, (p - 1)/2$. □


Proposition 6.4.3.

$$\prod_{k=1}^{(p-1)/2} (\zeta^{2k-1} - \zeta^{-(2k-1)}) = \begin{cases} \sqrt{p}, & \text{if } p \equiv 1\ (4), \\ i\sqrt{p}, & \text{if } p \equiv 3\ (4). \end{cases}$$

Proof. By Proposition 6.4.2 we have only to compute the sign of the product.
The product is

$$i^{(p-1)/2} \prod_{k=1}^{(p-1)/2} 2 \sin\frac{(4k - 2)\pi}{p}.$$

But $\sin((4k - 2)\pi/p) < 0$ if $(p + 2)/4 < k \le (p - 1)/2$. It follows that the
product has $(p - 1)/2 - [(p + 2)/4]$ negative terms and this is easily seen
to be $(p - 1)/4$ or $(p - 3)/4$ according as $p \equiv 1\ (4)$ or $p \equiv 3\ (4)$, respectively.
The result follows immediately. □

By Proposition 6.3.2 and Proposition 6.4.2 we know that

$$g(\chi) = \varepsilon \prod_{k=1}^{(p-1)/2} (\zeta^{2k-1} - \zeta^{-(2k-1)}), \qquad (1)$$

where $\varepsilon = \pm 1$. The evaluation of the Gauss sum is completed by Proposition
6.4.3 if we can show that $\varepsilon = +1$. The following argument of Kronecker
shows that this is the case. See also Exercise 22.


Proposition 6.4.4. $\varepsilon = +1$.

Proof. Consider the polynomial

$$f(x) = \sum_{j=1}^{p-1} \chi(j)x^j - \varepsilon \prod_{k=1}^{(p-1)/2} (x^{2k-1} - x^{p-(2k-1)}). \qquad (2)$$

Then $f(\zeta) = 0$ by (1) and $f(1) = 0$ by Lemma 2. By the comment preceding
Proposition 6.4.2 and the fact that $1 + x + \cdots + x^{p-1}$ and $x - 1$ are relatively
prime we conclude that $x^p - 1 \mid f(x)$. Write $f(x) = (x^p - 1)h(x)$ and
replace $x$ by $e^z$ to obtain

$$\sum_{j=1}^{p-1} \chi(j)e^{jz} - \varepsilon \prod_{k=1}^{(p-1)/2} (e^{(2k-1)z} - e^{(p-(2k-1))z}) = (e^{pz} - 1)h(e^z). \qquad (3)$$

The coefficient of $z^{(p-1)/2}$ on the left-hand side of (3) is easily seen to be

$$\frac{\sum_{j=1}^{p-1} \chi(j)j^{(p-1)/2}}{((p-1)/2)!} - \varepsilon \prod_{k=1}^{(p-1)/2} (4k - p - 2).$$

On the other hand by Exercise 21 the coefficient of $z^{(p-1)/2}$ on the right-hand
side of (3) is $pA/B$ where $p \nmid B$, $A$ and $B$ being integers. Equating coefficients,
multiplying by $B((p - 1)/2)!$ and reducing modulo $p$ shows that

$$\begin{aligned}
\sum_{j=1}^{p-1} \chi(j)j^{(p-1)/2} &\equiv \varepsilon \left(\frac{p-1}{2}\right)! \prod_{k=1}^{(p-1)/2} (4k - 2)\ (p) \\
&\equiv \varepsilon\,(2 \cdot 4 \cdot 6 \cdots (p - 1)) \prod_{k=1}^{(p-1)/2} (2k - 1) \\
&\equiv \varepsilon\,(p - 1)! \\
&\equiv -\varepsilon\ (p)
\end{aligned}$$

using Wilson’s theorem (corollary to Proposition 4.1.1).

By Proposition 5.1.2, $j^{(p-1)/2} \equiv \chi(j)\ (p)$ so one has

$$\sum_{j=1}^{p-1} \chi(j)^2 \equiv (p - 1) \equiv -\varepsilon\ (p)$$

and therefore

$$\varepsilon \equiv 1\ (p).$$

Since $\varepsilon = \pm 1$ we conclude finally that $\varepsilon = 1$. This concludes the proof. □


The result may be stated as

Theorem 1. The value of the quadratic Gauss sum $g(\chi)$ is given by

$$g(\chi) = \begin{cases} \sqrt{p}, & \text{if } p \equiv 1\ (4), \\ i\sqrt{p}, & \text{if } p \equiv 3\ (4). \end{cases}$$

Notes 

In the famous eleventh supplement to L. Dirichlet’s Vorlesungen über
Zahlentheorie [127] (1893) R. Dedekind introduced the concept of an algebraic
number (§164) as well as that of an algebraic integer (§173). However the use
of certain algebraic integers such as Gauss sums to prove the law of quadratic
reciprocity occurs much earlier with Eisenstein, Jacobi, and others. Among
the various proofs of this theorem given by Gauss, the fourth (1811) and the
sixth (1818) are of central importance. The fourth proof is a corollary to
Gauss’ remarkable calculation of the value of the classical Gauss sum.
While, as we mentioned in Section 4, he proved this result in 1805, it was not
until 1811 that he published the proof in his famous paper “Summierung
gewisser Reihen von besonderer Art” [34]. In this paper he shows more
generally that if $n$ is any positive integer then $\sum_{t=0}^{n-1} \zeta^{t^2}$ has the value $\sqrt{n}$
or $i\sqrt{n}$ according as $n \equiv 1\ (4)$ or $n \equiv 3\ (4)$. Here $\zeta = e^{2\pi i/n}$. The argument
is quite ingenious. The proof can be found in English in Nagell [60], pp.
174-180. It is not difficult to derive quadratic reciprocity from this result
(see, for example, Dirichlet [125], pp. 253-256).

The sixth and last of Gauss’ published proofs of the law of quadratic
reciprocity was published in 1818 under the title “Neue Beweise und
Erweiterungen des Fundamentalsatzes in der Lehre von den Quadratischen
Resten” [34], pp. 496-510. He mentions in the introduction to this paper that
for years he had searched for a method that would generalize to the cubic and
biquadratic case and that finally his untiring efforts were crowned with success
(“… die unermüdliche Arbeit wurde endlich von glücklichem Erfolge
gekrönt.”). The purpose of publishing this sixth proof, he states, was to bring
to a close that part of the higher arithmetic dealing with quadratic residues
and to say, in a sense, farewell (“… und so diesem Teile der höheren
Arithmetik gewissermassen Lebewohl zu sagen.”). In this proof Gauss considers
the polynomial $f_k(x) = \sum_{t=0}^{p-1} \chi(t)x^{kt}$ and proves, without using roots of
unity, that $1 + x + \cdots + x^{p-1}$ divides $f_1(x)^2 - (-1)^{(p-1)/2}p$ as well as
$f_q(x) - (q/p)f_1(x)$. Reciprocity follows by noting that $f_q(x) \equiv f_1(x)^q\ (q)$. The
proof we have given in Section 3 amounts to putting $x = \zeta_p$ in the above and
working with congruences in the ring of algebraic integers. This observation
was made (at least) by Cauchy, Eisenstein, and Jacobi (in alphabetical order)
and represents the stepping stone to the study of the higher reciprocity laws
via Gauss sums.

The beginning student will do well to study several of the classical
introductions to the theory of algebraic numbers. Aside from Dirichlet and
Dedekind mentioned above, we cite E. Landau [165] and E. Hecke [44]. In
recent times there have appeared many texts of varying levels of difficulty.
We mention here W. Adams and L. Goldstein [84], LeVeque [180], and
H. Pollard and H. Diamond [63]. Hecke’s book has just appeared in English
(Algebraic Number Theory, Springer-Verlag, 1981).

Exercises 

1. Show that + y/z is an algebraic integer. 

2. Let $\alpha$ be an algebraic number. Show that there is an integer $n$ such that $n\alpha$ is an
algebraic integer.

3. If $\alpha$ and $\beta$ are algebraic integers, prove that any solution to $x^2 + \alpha x + \beta = 0$ is an
algebraic integer. Generalize this result.

4. A polynomial $f(x) \in \mathbb{Z}[x]$ is said to be primitive if the greatest common divisor of its
coefficients is 1. Prove that the product of primitive polynomials is again primitive.
[Hint: Let $f(x) = a_0x^n + a_1x^{n-1} + \cdots + a_n$ and $g(x) = b_0x^m + b_1x^{m-1} + \cdots + b_m$
be primitive. If $p$ is a prime, let $a_i$ and $b_j$ be the coefficients with the smallest subscripts
such that $p \nmid a_i$ and $p \nmid b_j$. Show that the coefficient of $x^{i+j}$ in $f(x)g(x)$ is not divisible
by $p$.] This is one of the many results known as Gauss’ lemma.


5. Let $\alpha$ be an algebraic integer and $f(x) \in \mathbb{Q}[x]$ be the monic polynomial of least degree
such that $f(\alpha) = 0$. Use Exercise 4 to show that $f(x) \in \mathbb{Z}[x]$.

6. Let $x^2 + mx + n \in \mathbb{Z}[x]$ be irreducible and $\alpha$ be a root. Show that $\mathbb{Q}[\alpha] =
\{r + s\alpha \mid r, s \in \mathbb{Q}\}$ is a ring (in fact, it is a field). Let $m^2 - 4n = D_0^2 D$, where $D$ is
square-free. Show that $\mathbb{Q}[\alpha] = \mathbb{Q}[\sqrt{D}]$.

7. (continuation) If $D \equiv 2, 3\ (4)$, show that all the algebraic integers in $\mathbb{Q}[\sqrt{D}]$
have the form $a + b\sqrt{D}$, where $a, b \in \mathbb{Z}$. If $D \equiv 1\ (4)$, show that all the algebraic
integers in $\mathbb{Q}[\sqrt{D}]$ have the form $a + b((-1 + \sqrt{D})/2)$, where $a, b \in \mathbb{Z}$. [Hint: Show
that $r + s\sqrt{D}$ satisfies $x^2 - 2rx + (r^2 - Ds^2) = 0$. Thus by Exercise 5, $r + s\sqrt{D}$ is
an algebraic integer iff $2r$ and $r^2 - Ds^2$ are in $\mathbb{Z}$.]

8. Let $\omega = e^{2\pi i/3}$. $\omega$ satisfies $x^3 - 1 = 0$. Show that $(2\omega + 1)^2 = -3$ and use this to
determine $(-3/p)$ by the method of Section 2.

9. Verify Proposition 6.3.2 explicitly for $p = 3$ and $p = 5$; i.e., write out the Gauss sum
longhand and square.

10. What is 

11. By evaluating $\sum_t (1 + (t/p))\zeta^t$ in two ways prove that $g = \sum_t \zeta^{t^2}$.

12. Write $\psi_a(t) = \zeta^{at}$. Show that

(a) $\overline{\psi_a(t)} = \psi_a(-t) = \psi_{-a}(t)$.

(b) $(1/p)\sum_a \psi_a(t - s) = \delta(t, s)$.

13. Let $f$ be a function from $\mathbb{Z}$ to the complex numbers. Suppose that $p$ is a prime and that
$f(n + p) = f(n)$ for all $n \in \mathbb{Z}$. Let $\hat{f}(a) = p^{-1}\sum_t f(t)\psi_{-a}(t)$. Prove that $f(t) =
\sum_a \hat{f}(a)\psi_a(t)$. This result is directly analogous to a result in the theory of Fourier
series.

14. In Exercise 13 take $f$ to be the Legendre symbol and show that $\hat{f}(a) = p^{-1}g_{-a}$.

15. Show that $|\sum_{t=m}^{n} (t/p)| < \sqrt{p}\log p$. The inequality holds for the sum over any range.
This remarkable inequality is associated with the names of Polya and Vinogradov.
[Hint: Use the relation $(t/p)g = g_t$ and sum. The inequality $\sin x > (2/\pi)x$ for any
acute angle $x$ will be useful.]


16. Let a be an algebraic number with minimal polynomial /(x). Show that f(x) does 
not have repeated roots in C. 

17. Show that the minimal polynomial for $\sqrt[3]{2}$ is $x^3 - 2$.

18. Show that there exist algebraic numbers of arbitrarily high degree.

19. Find the conjugates of $\cos 2\pi/5$.

20. Let $F$ be a subfield of $\mathbb{C}$ which is a finite dimensional vector space over $\mathbb{Q}$ of degree $n$.
Show that every element of $F$ is algebraic of degree at most $n$. [Note: That an element
exists with degree exactly $n$ is more difficult to prove (see Exercise 17, Chapter 12).]

21. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n/n!$ and $g(x) = \sum_{n=0}^{\infty} b_n x^n/n!$ be power series with $a_n$ and
$b_n$ integers. If $p$ is a prime such that $p \mid a_i$ for $i = 0, \ldots, p - 1$, show that each
coefficient $c_t$ of the product $f(x)g(x) = \sum_{n=0}^{\infty} c_n x^n$ for $t = 0, \ldots, p - 1$ may be
written in the form $p(A/B)$, $p \nmid B$.

22. Show that the relation $\varepsilon \equiv 1\ (p)$ in Proposition 6.4.4 can also be achieved by replacing
$x$ by $1 + t$ instead of $e^z$.

23. If $f(x) = x^n + a_1x^{n-1} + \cdots + a_n$, $a_i \in \mathbb{Z}$ and $p$ is a prime such that $p \mid a_i$, $i =
1, \ldots, n$, $p^2 \nmid a_n$, show that $f(x)$ is irreducible over $\mathbb{Q}$ (Eisenstein’s irreducibility
criterion).



Chapter 7 

Finite Fields 


We have already met with examples of finite fields,
namely, the fields $\mathbb{Z}/p\mathbb{Z}$, where $p$ is a prime number.
In this chapter we shall prove that there are many more
finite fields and shall investigate their properties. This
theory is beautiful and interesting in itself and, moreover,
is a very useful tool in number-theoretic investigations.
As an illustration of the latter point, we shall supply yet
another proof of the law of quadratic reciprocity. Other
applications will come later.

One more comment. Up to now the great majority 
of our proofs have used very few results from abstract 
algebra. Although nowhere in this book will we use very 
sophisticated results from algebra, from now on we shall 
assume that the reader has some familiarity with the 
material in a standard undergraduate course in the subject. 


§1 Basic Properties of Finite Fields 

In this section we shall discuss properties of finite fields without worrying 
about questions of existence. The construction of finite fields will be taken 
up in Section 2. 

Let $F$ be a finite field with $q$ elements. The multiplicative group $F^*$ of $F$
has $q - 1$ elements. Thus every element $\alpha \in F^*$ satisfies the equation $x^{q-1} = 1$
(in this context 1 stands for the multiplicative identity of $F$ and not the integer
1), and every element in $F$ satisfies $x^q = x$.

Proposition 7.1.1.

$$x^q - x = \prod_{\alpha \in F} (x - \alpha).$$

Proof. Both polynomials are to be considered as elements of $F[x]$.

Every element $\alpha \in F$ is a root of $x^q - x$. Since $F$ has $q$ elements and since
the degree of $x^q - x$ is $q$, the result follows. □

Corollary 1. Let $F \subset K$, where $K$ is a field. An element $\alpha \in K$ is in $F$ iff $\alpha^q = \alpha$.

Proof. $\alpha^q = \alpha$ iff $\alpha$ is a root of $x^q - x$. By Proposition 7.1.1, the roots of
$x^q - x$ are precisely the elements of $F$. □

Corollary 2. If $f(x)$ divides $x^q - x$, then $f(x)$ has $d$ distinct roots, where $d$ is the
degree of $f(x)$.

Proof. Let $f(x)g(x) = x^q - x$. $g(x)$ has degree $q - d$. If $f(x)$ has fewer than
$d$ distinct roots, then by Lemma 1 of Chapter 4, $f(x)g(x)$ would have fewer
than $d + (q - d) = q$ distinct roots, which is not the case. □

Theorem 1. The multiplicative group of a finite field is cyclic.

Proof. This theorem is a generalization of Theorem 1 in Chapter 4. The proof
is almost identical.

If $d \mid q - 1$, then $x^d - 1$ divides $x^{q-1} - 1$ and it follows from Corollary 2
that $x^d - 1$ has $d$ distinct roots. Thus the subgroup of $F^*$ consisting of elements
satisfying $x^d = 1$ has order $d$.

Let $\psi(d)$ be the number of elements in $F^*$ of order $d$. Then $\sum_{c \mid d} \psi(c) = d$.
By the Möbius inversion formula

$$\psi(d) = \sum_{c \mid d} \mu(c)\frac{d}{c} = \phi(d).$$

In particular, $\psi(q - 1) = \phi(q - 1) \ge 1$, so $F^*$ contains an element of order
$q - 1$. This concludes the proof. □
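The counting statement in the proof, that the number of elements of order $d$ is $\phi(d)$, can be observed directly in the prime fields $\mathbb{Z}/p\mathbb{Z}$. The sketch below is restricted to that special case as a simplifying assumption (the theorem covers every finite field), and the function names are hypothetical.

```python
from math import gcd

def phi(n):
    """Euler's phi function, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def order(a, p):
    """Multiplicative order of a modulo the prime p (a not divisible by p)."""
    x, k = a % p, 1
    while x != 1:
        x = x * a % p
        k += 1
    return k

def order_counts(p):
    """Map each d to the number of elements of (Z/pZ)* having order d."""
    counts = {}
    for a in range(1, p):
        d = order(a, p)
        counts[d] = counts.get(d, 0) + 1
    return counts

def expected_counts(p):
    return {d: phi(d) for d in range(1, p) if (p - 1) % d == 0}
```

For $p = 13$ the orders $1, 2, 3, 4, 6, 12$ occur $1, 1, 2, 2, 2, 4$ times, so there are $\phi(12) = 4$ generators.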

The fact that F* is cyclic when F is finite allows us to give the following 
partial generalization of Proposition 4.2.1. 

Proposition 7.1.2. Let $\alpha \in F^*$. Then $x^n = \alpha$ has solutions iff $\alpha^{(q-1)/d} = 1$, where
$d = (n, q - 1)$. If there are solutions, then there are exactly $d$ solutions.

Proof. Let $\gamma$ be a generator of $F^*$ and set $\alpha = \gamma^a$ and $x = \gamma^y$. Then $x^n = \alpha$ is
equivalent to the congruence $ny \equiv a\ (q - 1)$. The result now follows by
applying Proposition 3.3.1. □

It is worthwhile to examine what happens in the extreme cases $n \mid q - 1$
and $(n, q - 1) = 1$.

If $n \mid q - 1$, then there are exactly $(q - 1)/n$ elements of $F^*$ that are $n$th
powers, and if $\alpha$ is an $n$th power, then $x^n = \alpha$ has $n$ solutions.

If $(n, q - 1) = 1$, then every element is an $n$th power in a unique way;
i.e., for $\alpha \in F^*$, $x^n = \alpha$ has one and only one solution.
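Proposition 7.1.2 and the two extreme cases can be confirmed by exhaustive search in a small prime field. The sketch assumes $F = \mathbb{Z}/p\mathbb{Z}$, so that $q = p$; the function names are hypothetical.

```python
from math import gcd

def count_solutions(n, a, p):
    """Number of x in (Z/pZ)* with x^n = a, found by exhaustive search."""
    return sum(1 for x in range(1, p) if pow(x, n, p) == a % p)

def predicted_count(n, a, p):
    """Count predicted by the proposition: d solutions if a^((p-1)/d) = 1, else none."""
    d = gcd(n, p - 1)
    return d if pow(a, (p - 1) // d, p) == 1 else 0

def nth_powers(n, p):
    """The set of nth powers in (Z/pZ)*."""
    return {pow(x, n, p) for x in range(1, p)}
```

For example, with $p = 13$ and $n = 3$ there are $(13 - 1)/3 = 4$ cubes, while for $n = 5$, prime to 12, every nonzero residue is a fifth power.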

We have investigated the structure of F*. Now we turn our attention to 
the additive group of F. 

Lemma 1. Let F be a finite field. The integer multiples of the identity form a 
subfield of F isomorphic to Z/pZ for some prime number p. 

Proof. To avoid confusion, let us temporarily call $e$ the identity of $F^*$ instead
of 1. Map $\mathbb{Z}$ to $F$ by taking $n$ to $ne$. This is easily seen to be a ring
homomorphism. The image is a finite subring of $F$, and so in particular it is an
integral domain. The kernel is a nonzero prime ideal. Therefore, the image is
isomorphic to $\mathbb{Z}/p\mathbb{Z}$ for some prime $p$. □

We shall identify $\mathbb{Z}/p\mathbb{Z}$ with its image in $F$ and think of $F$ as a finite
dimensional vector space over $\mathbb{Z}/p\mathbb{Z}$. Let $n$ denote that dimension and let
$\omega_1, \omega_2, \ldots, \omega_n$ be a basis. Then every element $\omega \in F$ can be expressed uniquely
in the form $a_1\omega_1 + a_2\omega_2 + \cdots + a_n\omega_n$, where $a_i \in \mathbb{Z}/p\mathbb{Z}$. It follows that $F$
has $p^n$ elements. We have proved

Proposition 7.1.3. The number of elements in a finite field is a power of a prime. 

If $e$ is the identity of the finite field $F$, let $p$ be the smallest integer such that
$pe = 0$. We have seen that $p$ must be a prime number. It is called the
characteristic of $F$. For $\alpha \in F$ we have $p\alpha = p(e\alpha) = (pe)\alpha = 0 \cdot \alpha = 0$. This
observation leads to the following very useful proposition.

Proposition 7.1.4. If $F$ has characteristic $p$, then $(\alpha + \beta)^{p^d} = \alpha^{p^d} + \beta^{p^d}$ for all
$\alpha, \beta \in F$ and all positive integers $d$.

Proof. The proof is by induction on $d$. For $d = 1$, we have

$$(\alpha + \beta)^p = \alpha^p + \sum_{k=1}^{p-1} \binom{p}{k}\alpha^{p-k}\beta^k + \beta^p.$$

All the intermediate terms vanish because $p \mid \binom{p}{k}$ for $1 \le k \le p - 1$ by Lemma
2 of Chapter 4.

To pass from $d$ to $d + 1$ just raise both sides of $(\alpha + \beta)^{p^d} = \alpha^{p^d} + \beta^{p^d}$ to the
$p$th power. □

Suppose that $F$ is a finite field of dimension $n$ over $\mathbb{Z}/p\mathbb{Z}$. We want to find
out which fields $E$ lie between $\mathbb{Z}/p\mathbb{Z}$ and $F$. If $d$ is the dimension of $E$ over
$\mathbb{Z}/p\mathbb{Z}$, then it follows by general field theory that $d \mid n$. We shall give another
proof below. It turns out that there is one and only one intermediate field
corresponding to every divisor $d$ of $n$.

Lemma 2. Let $F$ be a field. Then $x^l - 1$ divides $x^m - 1$ in $F[x]$ iff $l$ divides $m$.

Proof. Let $m = ql + r$, where $0 \le r < l$. Then we have

$$\frac{x^m - 1}{x^l - 1} = x^r\,\frac{x^{ql} - 1}{x^l - 1} + \frac{x^r - 1}{x^l - 1}.$$

Since $(x^{ql} - 1)/(x^l - 1) = (x^l)^{q-1} + (x^l)^{q-2} + \cdots + x^l + 1$, the right-hand
side of the above equation is a polynomial iff $(x^r - 1)/(x^l - 1)$ is a
polynomial. This is easily seen to be the case iff $r = 0$. The result follows. □



Lemma 3. If $a$ is a positive integer, then $a^l - 1$ divides $a^m - 1$ iff $l$ divides $m$.

Proof. The proof is analogous to that of Lemma 2 with the number $a$ playing
the role of $x$. We leave the details to the reader. □

Proposition 7.1.5. Let $F$ be a finite field of dimension $n$ over $\mathbb{Z}/p\mathbb{Z}$. The subfields
of $F$ are in one-to-one correspondence with the divisors of $n$.

Proof. Suppose that $E$ is a subfield of $F$ and let $d$ be its dimension over $\mathbb{Z}/p\mathbb{Z}$.
We shall show that $d \mid n$.

Since $E^*$ has $p^d - 1$ elements all satisfying $x^{p^d - 1} - 1 = 0$, we have that
$x^{p^d - 1} - 1$ divides $x^{p^n - 1} - 1$. By Lemma 2, $p^d - 1$ divides $p^n - 1$ and
consequently, by Lemma 3, $d$ divides $n$.

Now suppose that $d \mid n$. Let $E = \{\alpha \in F \mid \alpha^{p^d} = \alpha\}$. We claim that $E$ is a
field. For if $\alpha, \beta \in E$, then

(a) $(\alpha + \beta)^{p^d} = \alpha^{p^d} + \beta^{p^d} = \alpha + \beta$.

(b) $(\alpha\beta)^{p^d} = \alpha^{p^d}\beta^{p^d} = \alpha\beta$.

(c) $(\alpha^{-1})^{p^d} = (\alpha^{p^d})^{-1} = \alpha^{-1}$ for $\alpha \ne 0$.

In step (a) we made use of Proposition 7.1.4.

Now $E$ is the set of solutions to $x^{p^d} - x = 0$. Since $d \mid n$, we have
$p^d - 1 \mid p^n - 1$ and $x^{p^d - 1} - 1 \mid x^{p^n - 1} - 1$ by Lemmas 2 and 3. Thus $x^{p^d} - x$ divides
$x^{p^n} - x$, and by Corollary 2 to Proposition 7.1.1, it follows that $E$ has $p^d$
elements and so has dimension $d$ over $\mathbb{Z}/p\mathbb{Z}$.

Finally, if $E'$ is another subfield of $F$ of dimension $d$ over $\mathbb{Z}/p\mathbb{Z}$, then the
elements of $E'$ must satisfy $x^{p^d} - x = 0$; i.e., $E'$ must coincide with $E$. □
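The correspondence between subfields and divisors makes the subfield lattice a purely arithmetic object. The sketch below lists the orders $p^d$ of the subfields of a field with $p^n$ elements and the covering relations between them, using only the divisibility criterion of the proposition; it does not construct any fields, and the function names are hypothetical.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def subfield_orders(p, n):
    """Orders p^d of the subfields of a field with p^n elements, one per divisor d of n."""
    return [p ** d for d in divisors(n)]

def covering_pairs(n):
    """Pairs (d, e) of divisors of n with d | e, d < e, and no divisor strictly between.

    Each pair corresponds to an inclusion of the subfield of dimension d in the
    subfield of dimension e with no intermediate field.
    """
    divs = divisors(n)
    pairs = []
    for d in divs:
        for e in divs:
            if d < e and e % d == 0:
                between = [c for c in divs if d < c < e and c % d == 0 and e % c == 0]
                if not between:
                    pairs.append((d, e))
    return pairs
```

With $p = 2$ and $n = 12$ this gives six subfields, of orders 2, 4, 8, 16, 64, and 4096.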

Let $F_q$ denote a finite field with $q$ elements. To illustrate Proposition 7.1.5,
consider $F_{4096}$ (we shall show in Section 2 the existence of such a field).
Since $4096 = 2^{12}$ we have the following lattice diagram:

[Figure: lattice of subfields of $F_{4096}$. By Proposition 7.1.5 there is one subfield for each divisor of 12, namely $F_2$, $F_4$, $F_8$, $F_{16}$, $F_{64}$, and $F_{4096}$.]


§2 The Existence of Finite Fields


In Section 1 we proved that the number of elements in a finite field has the
form $p^n$, where $p$ is a prime. We shall now show that given a number $p^n$ there
exists a finite field with $p^n$ elements. To do this we shall need some results
from the theory of fields that connect our problem with the existence of
irreducible polynomials. Then we shall prove a theorem going back to
Gauss (again!) that shows that $\mathbb{Z}/p\mathbb{Z}[x]$ contains irreducible polynomials
of every degree.

Let $k$ be an arbitrary field and $f(x)$ be an irreducible polynomial in $k[x]$.
We then have

Proposition 7.2.1. There exists a field $K$ containing $k$ and an element $\alpha \in K$ such
that $f(\alpha) = 0$.

Proof. We proved in Chapter 1 that $k[x]$ is a principal ideal domain. It
follows that $(f(x))$ is a maximal ideal and thus $k[x]/(f(x))$ is a field. Let
$K' = k[x]/(f(x))$ and let $\phi$ be the homomorphism that maps $k[x]$ onto $K'$
by taking an element to its coset modulo $(f(x))$. We have the diagram

$$\begin{array}{ccc} k[x] & \xrightarrow{\ \phi\ } & K' \\ \cup & & \cup \\ k & \longrightarrow & \phi(k) \end{array}$$

$\phi(k)$ is a subfield of $K'$. We claim that it is isomorphic to $k$. It is enough to
show that $\phi$ restricted to $k$ is one to one. Let $a \in k$. If $\phi(a) = 0$, then $a \in (f(x))$.
If $a \ne 0$, it is a unit and cannot be an element of a proper ideal. Thus $a = 0$,
as was to be shown.

Since $\phi$ is an isomorphism of $k$ onto $\phi(k)$ we may identify $k$ with $\phi(k)$. When this is
done we relabel $K'$ as $K$.

Let $\alpha$ be the coset of $x$ in $K$. Then $0 = \phi(f(x)) = f(\phi(x)) = f(\alpha)$; i.e., $\alpha$
is a root of $f(x)$ in $K$. □

We denote the field $K$ constructed in the proposition by $k(\alpha)$. The following
proposition about $k(\alpha)$ will be useful.

Proposition 7.2.2. The elements $1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$ are a vector space basis for
$k(\alpha)$ over $k$, where $n$ is the degree of $f(x)$.

The proof of this proposition is the same as that of Proposition 6.1.8 and
its corollary. One replaces $\mathbb{Q}$ by $k$ and the complex number $\alpha$ of that proposition
by the above $\alpha$.

To turn the matter around, the proposition shows that if we want to find
a field extension $K$ of $k$ of degree $n$, then it is enough to produce an irreducible
polynomial $f(x) \in k[x]$ of degree $n$.
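To make the construction concrete, here is a minimal sketch of arithmetic in $\mathbb{Z}/p\mathbb{Z}[x]/(f(x))$ using the basis $1, \alpha, \ldots, \alpha^{n-1}$ of Proposition 7.2.2. The representation (coefficient lists, lowest degree first) and the names `mul_mod` and `evaluate_at_alpha` are assumptions made for illustration; $f$ is assumed monic of degree $n \ge 2$, and irreducibility is not checked.

```python
def mul_mod(a, b, f, p):
    """Multiply a and b in (Z/pZ)[x]/(f).

    a, b: coefficient lists of length n (lowest degree first).
    f: monic polynomial of degree n, as a list of n + 1 coefficients.
    """
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Reduce from the top using x^n = -(f_0 + f_1 x + ... + f_{n-1} x^{n-1}).
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        prod[k] = 0
        for i in range(n):
            prod[k - n + i] = (prod[k - n + i] - c * f[i]) % p
    return prod[:n]


def evaluate_at_alpha(f, p):
    """Compute f(alpha), where alpha is the coset of x; the result should be zero."""
    n = len(f) - 1
    alpha = [0, 1] + [0] * (n - 2)
    power = [1] + [0] * (n - 1)  # alpha^0
    total = [0] * n
    for c in f:
        total = [(t + c * w) % p for t, w in zip(total, power)]
        power = mul_mod(power, alpha, f, p)
    return total


# F_4 = (Z/2Z)[x]/(x^2 + x + 1): alpha^2 = alpha + 1.
f4 = [1, 1, 1]
print(mul_mod([0, 1], [0, 1], f4, 2))  # [1, 1]
print(evaluate_at_alpha(f4, 2))        # [0, 0]
```

Each element is a list of $n$ coefficients, so the field has $p^n$ elements, in line with the dimension count in Proposition 7.2.2.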
In $\mathbb{Z}/p\mathbb{Z}[x]$ there are finitely many polynomials of a given degree. Let $F_d(x)$
be the product of the monic irreducible polynomials in $\mathbb{Z}/p\mathbb{Z}[x]$ of degree $d$.

Theorem 2.

$$x^{p^n} - x = \prod_{d \mid n} F_d(x).$$

Proof. First notice that if $f(x)$ divides $x^{p^n} - x$, then $f(x)^2$ does not divide
$x^{p^n} - x$. This follows since if $x^{p^n} - x = f(x)^2 g(x)$ we obtain

$$-1 = 2f(x)f'(x)g(x) + f(x)^2 g'(x)$$

by formal differentiation. This is impossible since it implies that $f(x)$ divides 1.

It remains to prove that if $f(x)$ is a monic irreducible polynomial of degree
$d$, then $f(x) \mid x^{p^n} - x$ iff $d \mid n$.

Consider $K = \mathbb{Z}/p\mathbb{Z}(\alpha)$, where $\alpha$ is a root of $f(x)$, as in Proposition 7.2.2.
It has dimension $d$ over $\mathbb{Z}/p\mathbb{Z}$ and thus $p^d$ elements. The elements of $K$ satisfy
$x^{p^d} - x = 0$.

Assume that $x^{p^n} - x = f(x)g(x)$. Then $\alpha^{p^n} = \alpha$. If $b_1\alpha^{d-1} + b_2\alpha^{d-2} + \cdots + b_d$
is an arbitrary element of $K$, then

$$(b_1\alpha^{d-1} + \cdots + b_d)^{p^n} = b_1(\alpha^{p^n})^{d-1} + \cdots + b_d = b_1\alpha^{d-1} + \cdots + b_d.$$

Hence the elements of $K$ satisfy $x^{p^n} - x = 0$. It follows that $x^{p^d} - x$ divides
$x^{p^n} - x$, and by Lemmas 2 and 3 of Section 1, $d$ divides $n$.

Assume now that $d \mid n$. Since $\alpha^{p^d} = \alpha$ and $f(x)$ is the monic irreducible
polynomial for $\alpha$, we have $f(x) \mid x^{p^d} - x$. Since $d \mid n$ we have $x^{p^d} - x \mid x^{p^n} - x$,
again by Lemmas 2 and 3 of Section 1. Thus $f(x) \mid x^{p^n} - x$. □

Let $N_d$ be the number of monic irreducible polynomials of degree $d$ in
$\mathbb{Z}/p\mathbb{Z}[x]$. Equating the degrees on both sides of the identity in the theorem
yields

Corollary 1. $p^n = \sum_{d \mid n} d N_d$.

Corollary 2. $N_n = n^{-1} \sum_{d \mid n} \mu(n/d) p^d$.

Proof. Apply the Möbius inversion formula (Theorem 2 of Chapter 2) to the
equation in Corollary 1. □

Corollary 3. For each integer $n \ge 1$, there exists an irreducible polynomial of
degree $n$ in $\mathbb{Z}/p\mathbb{Z}[x]$.

Proof. $N_n = n^{-1}(p^n - \cdots + p\,\mu(n))$ by Corollary 2. The term in parentheses
cannot be zero since it is the sum of distinct powers of $p$ with coefficients 1
and $-1$. □
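The counting formula of Corollary 2 can be compared with a direct enumeration for small cases. The sketch below is only a sanity check under stated assumptions: the helper names are illustrative, polynomials are stored as coefficient tuples (lowest degree first), and reducible monic polynomials are found by multiplying out all pairs of monic factors.

```python
from itertools import product


def mobius(n):
    """Moebius function by trial division."""
    result, k = 1, 2
    while k * k <= n:
        if n % k == 0:
            n //= k
            if n % k == 0:
                return 0  # a square factor divides the original n
            result = -result
        k += 1
    return -result if n > 1 else result


def count_by_formula(p, n):
    """N_n = (1/n) * sum over d | n of mu(n/d) * p^d."""
    total = sum(mobius(n // d) * p ** d for d in range(1, n + 1) if n % d == 0)
    return total // n


def monic(p, d):
    """All monic polynomials of degree d over Z/pZ."""
    return [c + (1,) for c in product(range(p), repeat=d)]


def poly_mul(a, b, p):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return tuple(out)


def count_by_enumeration(p, n):
    """Count monic irreducibles of degree n by removing all products of monic factors."""
    reducible = set()
    for d in range(1, n // 2 + 1):
        for a in monic(p, d):
            for b in monic(p, n - d):
                reducible.add(poly_mul(a, b, p))
    return p ** n - len(reducible)


for n in range(1, 5):
    print(n, count_by_formula(2, n), count_by_enumeration(2, n))
```

The two columns should agree for every $n$; the enumeration grows quickly and is only meant for small $p$ and $n$.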

Summarizing, we have

Theorem 3. Let $n \ge 1$ be an integer and $p$ be a prime. Then there exists a finite
field with $p^n$ elements.
§3 An Application to Quadratic Residues

In Chapter 6 we proved the law of quadratic reciprocity using Gauss sums
and the elements of the theory of algebraic numbers. We shall now give an
exceptionally short proof along the same lines using the theory of finite fields.

Let $p$ and $q$ be distinct odd primes. Since $(p, q) = 1$ there is an integer
$n$ (for example, $p - 1$) such that $q^n \equiv 1 \ (p)$. Let $F$ be a finite field of dimension
$n$ over $\mathbb{Z}/q\mathbb{Z}$. Then $F^*$ is cyclic of order $q^n - 1$. Let $\gamma$ be a generator of $F^*$
and set $\lambda = \gamma^{(q^n - 1)/p}$. Then $\lambda$ has order $p$. Define $\tau_a = \sum_{t=0}^{p-1} (t/p)\lambda^{at}$, where
$a \in \mathbb{Z}$. The element $\tau_a$ of $F$ is an analog of the quadratic Gauss sums introduced
in Chapter 6. Set $\tau_1 = \tau$. Then the proofs of Propositions 6.3.1 and
6.3.2 can be used to show that

(1) $\tau_a = (a/p)\tau$.

(2) $\tau^2 = (-1)^{(p-1)/2}\,\bar{p}$.

In relation 2, $\bar{p}$ is the coset of $p$ in $\mathbb{Z}/q\mathbb{Z}$. Let $p^* = (-1)^{(p-1)/2} p$. Then
relation 2 can be written as $\tau^2 = \bar{p}^*$. This relation implies that $(p^*/q) = 1$
iff $\tau \in \mathbb{Z}/q\mathbb{Z}$. By Corollary 1 to Proposition 7.1.1, this is true iff $\tau^q = \tau$. Now,

$$\tau^q = \Bigl(\sum_t (t/p)\lambda^t\Bigr)^q = \sum_t (t/p)\lambda^{qt} = \tau_q.$$

By relation 1 we have $\tau^q = (q/p)\tau$. Thus $\tau^q = \tau$ iff $(q/p) = 1$.
We have proved that

$$(p^*/q) = 1 \quad \text{iff} \quad (q/p) = 1.$$

This is the law of quadratic reciprocity.
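The form of reciprocity obtained here, $(p^*/q) = (q/p)$ with $p^* = (-1)^{(p-1)/2}p$, is easy to test numerically. The following is a small sketch; the helper names `legendre` and `p_star` are illustrative, and the Legendre symbol is computed by Euler's criterion rather than by the finite field argument above.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def p_star(p):
    """p* = (-1)^((p-1)/2) * p."""
    return p if p % 4 == 1 else -p


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
checks = [legendre(p_star(p), q) == legendre(q, p)
          for p in odd_primes for q in odd_primes if p != q]
print(all(checks))
```

Every pair of distinct odd primes in the list should satisfy the relation.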

A proof that $(2/q) = (-1)^{(q^2-1)/8}$ can be given using the same technique.
In Chapter 6 we gave Euler's proof that $(2/q) = 1$ if $q \equiv 1 \ (8)$. If $q \not\equiv 1 \ (8)$, it
is nevertheless true that $q^2 \equiv 1 \ (8)$. In this case one can carry through the
proof working in a finite field $F$ of dimension 2 over $\mathbb{Z}/q\mathbb{Z}$. We leave the details
to the reader.


Notes 

The first systematic account of the theory of finite fields is found in Dickson
[25], although E. Galois had axiomatically developed a number of their
properties much earlier in his note "Sur la théorie des nombres" [33]. As
the existence of a finite field with $p^n$ elements is equivalent to the existence of
an irreducible polynomial of degree $n$ in the ring $F_p[x]$ we must include Gauss
once again as a founder. In his paper "Die Lehre von den Resten" he derives
the formula we have given for the number of irreducibles of degree $n$ (see
[34]).

The use of finite fields to give a proof of quadratic reciprocity has been
observed by a number of mathematicians, e.g., Hausner [43] and Holzer
[45, pp. 76-78].

Our treatment of finite fields throughout this book is much more elementary
than is usual in modern times. Most treatments first develop the full
Galois theory of fields and apply the general results of that theory to the 
special case of finite fields. This is done in A. Albert’s compact book [1]. 
The advantage of Albert’s book for those readers already familiar with the 
theory of fields is that he discusses finite fields extensively in his last chapter 
and provides a very long bibliography on the subject. Many interesting 
references are provided. 


Exercises 

1. Use the method of Theorem 1 to show that a finite subgroup of the multiplicative
group of a field is cyclic.

2. Let $\mathbb{R}$ and $\mathbb{C}$ be the real and complex numbers, respectively. Find the finite subgroups
of $\mathbb{R}^*$ and $\mathbb{C}^*$ and show directly that they are cyclic.

3. Let $F$ be a field with $q$ elements and suppose that $q \equiv 1 \ (n)$. Show that for $\alpha \in F^*$
the equation $x^n = \alpha$ has either no solutions or $n$ solutions.

4. (continuation) Show that the set of $\alpha \in F^*$ such that $x^n = \alpha$ is solvable is a subgroup
with $(q - 1)/n$ elements.

5. (continuation) Let $K$ be a field containing $F$ such that $[K : F] = n$. For all $\alpha \in F^*$
show that the equation $x^n = \alpha$ has $n$ solutions in $K$. [Hint: Show that $q^n - 1$ is
divisible by $n(q - 1)$ and use the fact that $\alpha^{q-1} = 1$.]

6. Let $K \supset F$ be finite fields with $[K : F] = 3$. Show that if $\alpha \in F$ is not a square in $F$, it is
not a square in $K$.

7. Generalize Exercise 6 by showing that if $\alpha$ is not a square in $F$, it is not a square in
any extension of odd degree and is a square in every extension of even degree.

8. In a field with $2^n$ elements what is the subgroup of squares?

9. If $K \supset F$ are finite fields, $|F| = q$, $\alpha \in F$, $q \equiv 1 \ (n)$, and $x^n = \alpha$ is not solvable in $F$,
show that $x^n = \alpha$ is not solvable in $K$ if $(n, [K : F]) = 1$.

10. Let $K \supset F$ be finite fields and $[K : F] = 2$. For $\beta \in K$ show that $\beta^{1+q} \in F$ and moreover
that every element in $F$ is of the form $\beta^{1+q}$ for some $\beta \in K$.

11. With the situation being that of Exercise 10 suppose that $\alpha \in F$ has order $q - 1$. Show
that there is a $\beta \in K$ with order $q^2 - 1$ such that $\beta^{1+q} = \alpha$.

12. Use Proposition 7.2.1 to show that given a field $k$ and a polynomial $f(x) \in k[x]$ there
is a field $K \supset k$ such that $[K : k]$ is finite and $f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$
in $K[x]$.

13. Apply Exercise 12 to $k = \mathbb{Z}/p\mathbb{Z}$ and $f(x) = x^{p^n} - x$ to obtain another proof of
Theorem 2.
14. Let $F$ be a field with $q$ elements and $n$ a positive integer. Show that there exist
irreducible polynomials in $F[x]$ of degree $n$.

15. Let $x^n - 1 \in F[x]$, where $F$ is a finite field with $q$ elements. Suppose that $(q, n) = 1$.
Show that $x^n - 1$ splits into linear factors in some extension field and that the least
degree of such a field is the smallest integer $f$ such that $q^f \equiv 1 \ (n)$.

16. Calculate the monic irreducible polynomials of degree 4 in $\mathbb{Z}/2\mathbb{Z}[x]$.

17. Let $q$ and $p$ be distinct odd primes. Show that the number of monic irreducibles of
degree $q$ in $\mathbb{Z}/p\mathbb{Z}[x]$ is $q^{-1}(p^q - p)$.

18. Let $p$ be a prime with $p \equiv 3 \ (4)$. Show that the residue classes modulo $p$ in $\mathbb{Z}[i]$ form a
field with $p^2$ elements.

19. Let $F$ be a finite field with $q$ elements. If $f(x) \in F[x]$ has degree $t$, put $|f| = q^t$. Verify
the formal identity $\sum_f |f|^{-s} = (1 - q^{1-s})^{-1}$. The sum is over all monic polynomials.

20. With the notation of Exercise 19 let $d(f)$ be the number of monic divisors of $f$ and
$\sigma(f) = \sum_{g \mid f} |g|$, where the sum is over the monic divisors of $f$. Verify the following
identities:

(a) $\sum_f d(f)|f|^{-s} = (1 - q^{1-s})^{-2}$.

(b) $\sum_f \sigma(f)|f|^{-s} = (1 - q^{1-s})^{-1}(1 - q^{2-s})^{-1}$.

21. Let $F$ be a field with $q = p^n$ elements. For $\alpha \in F$ set $f(x) = (x - \alpha)(x - \alpha^p)(x - \alpha^{p^2}) \cdots (x - \alpha^{p^{n-1}})$.
Show that $f(x) \in \mathbb{Z}/p\mathbb{Z}[x]$. In particular, $\alpha + \alpha^p + \cdots + \alpha^{p^{n-1}}$
and $\alpha\alpha^p\alpha^{p^2} \cdots \alpha^{p^{n-1}}$ are in $\mathbb{Z}/p\mathbb{Z}$.

22. (continuation) Set $\mathrm{tr}(\alpha) = \alpha + \alpha^p + \cdots + \alpha^{p^{n-1}}$. Prove that

(a) $\mathrm{tr}(\alpha) + \mathrm{tr}(\beta) = \mathrm{tr}(\alpha + \beta)$.

(b) $\mathrm{tr}(a\alpha) = a\,\mathrm{tr}(\alpha)$ for $a \in \mathbb{Z}/p\mathbb{Z}$.

(c) There is an $\alpha \in F$ such that $\mathrm{tr}(\alpha) \ne 0$.

23. (continuation) For $\alpha \in F$ consider the polynomial $x^p - x - \alpha \in F[x]$. Show that
this polynomial is either irreducible or the product of linear factors. Prove that the
latter alternative holds iff $\mathrm{tr}(\alpha) = 0$.

24. Suppose that $f(x) \in \mathbb{Z}/p\mathbb{Z}[x]$ has the property that $f(x + y) = f(x) + f(y)$ in
$\mathbb{Z}/p\mathbb{Z}[x, y]$. Show that $f(x)$ must be of the form $a_0 x + a_1 x^p + a_2 x^{p^2} + \cdots + a_m x^{p^m}$.



Chapter 8 

Gauss and Jacobi Sums 


In Chapter 6 we introduced the notion of a quadratic
Gauss sum. In this chapter a more general notion of
Gauss sum will be introduced. These sums have many
applications. They will be used in Chapter 9 as a tool
in the proofs of the laws of cubic and biquadratic reciprocity.
Here we shall consider the problem of counting
the number of solutions of equations with coefficients in a
finite field. In this connection, the notion of a Jacobi sum
arises in a natural way. Jacobi sums are interesting in their
own right, and we shall develop some of their properties.

To keep matters as simple as possible, we shall confine
our attention to the finite field $\mathbb{Z}/p\mathbb{Z} = F_p$ and come back
later to the question of associating Gauss sums with an
arbitrary finite field.


§1 Multiplicative Characters 

A multiplicative character on $F_p$ is a map $\chi$ from $F_p^*$ to the nonzero complex
numbers that satisfies

$$\chi(ab) = \chi(a)\chi(b) \quad \text{for all } a, b \in F_p^*.$$

The Legendre symbol, $(a/p)$, is an example of such a character if it is
regarded as a function of the coset of $a$ modulo $p$.

Another example is the trivial multiplicative character defined by the
relation $\varepsilon(a) = 1$ for all $a \in F_p^*$.

It is often useful to extend the domain of definition of a multiplicative
character to all of $F_p$. If $\chi \ne \varepsilon$, we do this by defining $\chi(0) = 0$. For $\varepsilon$ we
define $\varepsilon(0) = 1$. The usefulness of these definitions will soon become apparent.

Proposition 8.1.1. Let $\chi$ be a multiplicative character and $a \in F_p^*$. Then

(a) $\chi(1) = 1$.

(b) $\chi(a)$ is a $(p - 1)$st root of unity.

(c) $\chi(a^{-1}) = \chi(a)^{-1} = \overline{\chi(a)}$.

[In part (a) the 1 on the left-hand side is the unit of $F_p$, whereas the 1 on
the right-hand side is the complex number 1. The bar in part (c) is complex
conjugation.]

Proof. $\chi(1) = \chi(1 \cdot 1) = \chi(1)\chi(1)$. Thus $\chi(1) = 1$, since $\chi(1) \ne 0$.

To prove part (b), notice that $a^{p-1} = 1$ implies that $1 = \chi(1) = \chi(a^{p-1}) = \chi(a)^{p-1}$.

To prove part (c), notice that $1 = \chi(1) = \chi(a^{-1}a) = \chi(a^{-1})\chi(a)$. This
shows that $\chi(a^{-1}) = \chi(a)^{-1}$. The fact that $\chi(a)^{-1} = \overline{\chi(a)}$ follows from the
fact that $\chi(a)$ is a complex number of absolute value 1 by part (b). □

Proposition 8.1.2. Let $\chi$ be a multiplicative character. If $\chi \ne \varepsilon$, then $\sum_t \chi(t) = 0$,
where the sum is over all $t \in F_p$. If $\chi = \varepsilon$, the value of the sum is $p$.

Proof. The last assertion is obvious, so we may assume that $\chi \ne \varepsilon$. In this
case there is an $a \in F_p^*$ such that $\chi(a) \ne 1$. Let $T = \sum_t \chi(t)$. Then

$$\chi(a)T = \sum_t \chi(a)\chi(t) = \sum_t \chi(at) = T.$$

The last equality follows since $at$ runs over all elements of $F_p$ as $t$ does.
Since $\chi(a)T = T$ and $\chi(a) \ne 1$ it follows that $T = 0$. □

The multiplicative characters form a group by means of the following
definitions. (We shall drop the use of the word multiplicative for the remainder
of this chapter.)

(1) If $\chi$ and $\lambda$ are characters, then $\chi\lambda$ is the map that takes $a \in F_p^*$ to $\chi(a)\lambda(a)$.

(2) If $\chi$ is a character, $\chi^{-1}$ is the map that takes $a \in F_p^*$ to $\chi(a)^{-1}$.

We leave it to the reader to verify that $\chi\lambda$ and $\chi^{-1}$ are characters and
that these definitions make the set of characters into a group. The identity
of this group is, of course, the trivial character $\varepsilon$.

Proposition 8.1.3. The group of characters is a cyclic group of order $p - 1$.
If $a \in F_p^*$ and $a \ne 1$, then there is a character $\chi$ such that $\chi(a) \ne 1$.

Proof. We know that $F_p^*$ is cyclic (see Theorem 1 of Chapter 4). Let $g \in F_p^*$
be a generator. Then every $a \in F_p^*$ is equal to a power of $g$. If $a = g^l$ and $\chi$
is a character, then $\chi(a) = \chi(g)^l$. This shows that $\chi$ is completely determined
by the value $\chi(g)$. Since $\chi(g)$ is a $(p - 1)$st root of unity, and since there are
exactly $p - 1$ of these, it follows that the character group has order at most
$p - 1$.

Now define a function $\lambda$ by the equation $\lambda(g^k) = e^{2\pi i (k/(p-1))}$. It is easy
to check that $\lambda$ is well defined and is a character. We claim that $p - 1$ is the
smallest positive integer $n$ such that $\lambda^n = \varepsilon$. If $\lambda^n = \varepsilon$, then $\lambda^n(g) = \varepsilon(g) = 1$. However,
$\lambda^n(g) = \lambda(g)^n = e^{2\pi i (n/(p-1))}$. It follows that $p - 1$ divides $n$. Since $\lambda^{p-1}(a) =
\lambda(a)^{p-1} = \lambda(a^{p-1}) = \lambda(1) = 1$ we have $\lambda^{p-1} = \varepsilon$. We have established that
the characters $\varepsilon, \lambda, \lambda^2, \ldots, \lambda^{p-2}$ are all distinct. Since by the first part of the
proof there are at most $p - 1$ characters, we now have that there are exactly
$p - 1$ characters and that the group is cyclic with $\lambda$ as a generator.

If $a \in F_p^*$ and $a \ne 1$, then $a = g^l$ with $p - 1 \nmid l$. Let us compute $\lambda(a)$.
$\lambda(a) = \lambda(g)^l = e^{2\pi i (l/(p-1))} \ne 1$. This concludes the proof. □
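The generator $\lambda$ used in the proof can be built explicitly once a primitive root is known. Here is a minimal sketch, assuming a small odd prime; the names `primitive_root` and `characters` are illustrative, and the primitive root is found by brute force. It builds the $p - 1$ characters $\lambda^j$ on $F_p^*$ and checks multiplicativity and the vanishing sum of Proposition 8.1.2 numerically.

```python
import cmath


def primitive_root(p):
    """Smallest generator of the cyclic group F_p^* (brute force, odd prime p)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def characters(p):
    """List of the p - 1 characters lambda^j, each a dict a -> lambda^j(a) on F_p^*."""
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}  # discrete logarithm table
    return [{a: cmath.exp(2j * cmath.pi * j * k / (p - 1)) for a, k in log.items()}
            for j in range(p - 1)]


p = 7
chars = characters(p)
lam = chars[1]

# Multiplicativity of lambda.
print(all(abs(lam[a * b % p] - lam[a] * lam[b]) < 1e-9
          for a in range(1, p) for b in range(1, p)))

# For chi != epsilon, the sum of chi(t) over t in F_p^* (chi(0) = 0) is numerically zero.
print([round(abs(sum(chi.values())), 9) for chi in chars[1:]])
```

The floating-point values are approximations to roots of unity, so all comparisons use a small tolerance.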

Corollary. If $a \in F_p^*$ and $a \ne 1$, then $\sum_\chi \chi(a) = 0$, where the summation is
over all characters.

Proof. Let $S = \sum_\chi \chi(a)$. Since $a \ne 1$ there is, by the proposition, a character $\lambda$
such that $\lambda(a) \ne 1$. Then

$$\lambda(a)S = \sum_\chi \lambda(a)\chi(a) = \sum_\chi \lambda\chi(a) = S.$$

The final equality holds since $\lambda\chi$ runs over all characters as $\chi$ does. It follows
that $(\lambda(a) - 1)S = 0$ and thus $S = 0$. □

Characters are useful in the study of equations. To illustrate this, consider
the equation $x^n = a$ for $a \in F_p^*$. By Proposition 4.2.1 we know that
solutions exist iff $a^{(p-1)/d} = 1$, where $d = (n, p - 1)$, and that if a solution
exists, then there are exactly $d$ solutions. For simplicity, we shall assume that
$n$ divides $p - 1$. In this case $d = (n, p - 1) = n$.

We shall now derive a criterion for the solvability of $x^n = a$ using characters.

Proposition 8.1.4. If $a \in F_p^*$, $n \mid p - 1$, and $x^n = a$ is not solvable, then there is a
character $\chi$ such that

(a) $\chi^n = \varepsilon$.

(b) $\chi(a) \ne 1$.

Proof. Let $g$ and $\lambda$ be as in Proposition 8.1.3 and set $\chi = \lambda^{(p-1)/n}$. Then
$\chi(g) = \lambda^{(p-1)/n}(g) = \lambda(g)^{(p-1)/n} = e^{2\pi i/n}$. Now $a = g^l$ for some $l$, and since
$x^n = a$ is not solvable, we must have $n \nmid l$. Then $\chi(a) = \chi(g)^l = e^{2\pi i (l/n)} \ne 1$.
Finally, $\chi^n = \lambda^{p-1} = \varepsilon$. □

For $a \in F_p$, let $N(x^n = a)$ denote the number of solutions of the equation
$x^n = a$. If $n \mid p - 1$, we have

Proposition 8.1.5. $N(x^n = a) = \sum_{\chi^n = \varepsilon} \chi(a)$, where the sum is over all characters
of order dividing $n$.

Proof. We claim first that there are exactly $n$ characters of order dividing $n$.
Since the value of $\chi(g)$ for such a character must be an $n$th root of unity, there
are at most $n$ such characters. In Proposition 8.1.4, we found a character
$\chi$ such that $\chi(g) = e^{2\pi i/n}$. It follows that $\varepsilon, \chi, \chi^2, \ldots, \chi^{n-1}$ are $n$ distinct
characters of order dividing $n$.

To prove the formula, notice that $x^n = 0$ has one solution, namely,
$x = 0$. Now $\sum_{\chi^n = \varepsilon} \chi(0) = 1$, since $\varepsilon(0) = 1$ and $\chi(0) = 0$ for $\chi \ne \varepsilon$.

Now suppose that $a \ne 0$ and that $x^n = a$ is solvable; i.e., there is an
element $b$ such that $b^n = a$. If $\chi^n = \varepsilon$, then $\chi(a) = \chi(b^n) = \chi(b)^n = \chi^n(b) =
\varepsilon(b) = 1$. Thus $\sum_{\chi^n = \varepsilon} \chi(a) = n$, which is $N(x^n = a)$ in this case.

Finally, suppose that $a \ne 0$ and that $x^n = a$ is not solvable. We must
show that $\sum_{\chi^n = \varepsilon} \chi(a) = 0$. Call the sum $T$. By Proposition 8.1.4, there is a
character $\rho$ such that $\rho(a) \ne 1$ and $\rho^n = \varepsilon$. A simple calculation shows that
$\rho(a)T = T$ (one uses the obvious fact that the characters of order dividing $n$
form a group). Thus $(\rho(a) - 1)T = 0$ and $T = 0$, as required. □

As a special case, suppose that $p$ is odd and that $n = 2$. Then the proposition
says that $N(x^2 = a) = 1 + (a/p)$, where $(a/p)$ is the Legendre symbol. This
equation is easy to check directly.
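Proposition 8.1.5 can also be checked by machine for small primes. The sketch below assumes $n \mid p - 1$ and uses the characters $\chi^j$ with $\chi = \lambda^{(p-1)/n}$ from Proposition 8.1.4, so that $\chi^j(g^l) = e^{2\pi i jl/n}$; the function names are illustrative.

```python
import cmath


def primitive_root(p):
    """Smallest generator of F_p^* (brute force, odd prime p)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found")


def count_solutions(p, n, a):
    """N(x^n = a) in F_p by direct enumeration."""
    return sum(1 for x in range(p) if pow(x, n, p) == a % p)


def character_sum(p, n, a):
    """Sum of chi(a) over the n characters of order dividing n (requires n | p - 1)."""
    a %= p
    if a == 0:
        return 1  # only epsilon contributes, by the convention epsilon(0) = 1
    g = primitive_root(p)
    l = next(k for k in range(p - 1) if pow(g, k, p) == a)  # a = g^l
    total = sum(cmath.exp(2j * cmath.pi * j * l / n) for j in range(n))
    return round(total.real)


p = 13
for n in (2, 3, 4, 6):
    print(n, all(count_solutions(p, n, a) == character_sum(p, n, a) for a in range(p)))
```

For $n = 2$ the character sum is $1 + (a/p)$, which is the special case noted above.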

In Section 3 we shall return to equations over the field F p . 


§2 Gauss Sums 

In Chapter 6 we introduced quadratic Gauss sums. The following definition
generalizes that notion.

Definition. Let $\chi$ be a character on $F_p$ and $a \in F_p$. Set $g_a(\chi) = \sum_t \chi(t)\zeta^{at}$, where
the sum is over all $t$ in $F_p$, and $\zeta = e^{2\pi i/p}$. $g_a(\chi)$ is called a Gauss sum on $F_p$
belonging to the character $\chi$.

Proposition 8.2.1. If $a \ne 0$ and $\chi \ne \varepsilon$, we have $g_a(\chi) = \chi(a^{-1})g_1(\chi)$. If $a \ne 0$
and $\chi = \varepsilon$, we have $g_a(\varepsilon) = 0$. If $a = 0$ and $\chi \ne \varepsilon$, we have $g_0(\chi) = 0$. If $a = 0$
and $\chi = \varepsilon$, we have $g_0(\varepsilon) = p$.

Proof. Suppose that $a \ne 0$ and that $\chi \ne \varepsilon$. Then

$$\chi(a)g_a(\chi) = \chi(a)\sum_t \chi(t)\zeta^{at} = \sum_t \chi(at)\zeta^{at} = g_1(\chi).$$

This proves the first assertion.

If $a \ne 0$, then

$$g_a(\varepsilon) = \sum_t \varepsilon(t)\zeta^{at} = \sum_t \zeta^{at} = 0.$$

We have used Lemma 1 of Chapter 6.

To finish the proof notice that $g_0(\chi) = \sum_t \chi(t)\zeta^{0t} = \sum_t \chi(t)$. If $\chi = \varepsilon$,
the result is $p$; if $\chi \ne \varepsilon$, the result is zero by Proposition 8.1.2. □

From now on we shall denote $g_1(\chi)$ by $g(\chi)$. We wish to determine the
absolute value of $g(\chi)$. This can be done fairly easily by imitating the proof
of Proposition 6.3.2.
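Proposition 8.2.2. If $\chi \ne \varepsilon$, then $|g(\chi)| = \sqrt{p}$.

Proof. The idea is to evaluate the sum $\sum_a g_a(\chi)\overline{g_a(\chi)}$ in two ways.

If $a \ne 0$, then by Proposition 8.2.1, $\overline{g_a(\chi)} = \overline{\chi(a^{-1})}\,\overline{g(\chi)} = \chi(a)\overline{g(\chi)}$ and
$g_a(\chi) = \chi(a^{-1})g(\chi)$. Thus $g_a(\chi)\overline{g_a(\chi)} = \chi(a^{-1})\chi(a)g(\chi)\overline{g(\chi)} = |g(\chi)|^2$. Since
$g_0(\chi) = 0$ our sum has the value $(p - 1)|g(\chi)|^2$.

On the other hand,

$$g_a(\chi)\overline{g_a(\chi)} = \sum_x \sum_y \chi(x)\overline{\chi(y)}\,\zeta^{ax - ay}.$$

Summing both sides over $a$ and using the corollary to Lemma 1 of
Chapter 6 yields

$$\sum_a g_a(\chi)\overline{g_a(\chi)} = \sum_x \sum_y \chi(x)\overline{\chi(y)}\,\delta(x, y)\,p = (p - 1)p.$$

Thus $(p - 1)|g(\chi)|^2 = (p - 1)p$ and the result follows. □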
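Propositions 8.2.1 and 8.2.2 lend themselves to a floating-point check. This is a sketch under stated assumptions: characters are represented as $\lambda^m$ for a brute-force primitive root, with the conventions $\varepsilon(0) = 1$ and $\chi(0) = 0$ otherwise, and the names `character` and `gauss_sum` are illustrative.

```python
import cmath


def character(p, m):
    """chi = lambda^m as a list indexed by residues 0..p-1 (odd prime p)."""
    g = next(h for h in range(2, p)
             if len({pow(h, k, p) for k in range(p - 1)}) == p - 1)
    chi = [1 if m % (p - 1) == 0 else 0] + [0] * (p - 1)  # value at 0
    for k in range(p - 1):
        chi[pow(g, k, p)] = cmath.exp(2j * cmath.pi * m * k / (p - 1))
    return chi


def gauss_sum(p, m, a=1):
    """g_a(chi) = sum over t of chi(t) * zeta^(a t), with zeta = exp(2 pi i / p)."""
    chi = character(p, m)
    return sum(chi[t] * cmath.exp(2j * cmath.pi * a * t / p) for t in range(p))


p = 13
for m in range(1, p - 1):
    # |g(chi)|^2 should be numerically close to p for every nontrivial chi.
    print(m, round(abs(gauss_sum(p, m)) ** 2, 6))
```

The same function with `a=0` or `m=0` reproduces the degenerate cases listed in Proposition 8.2.1.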

The relation of the above result to Proposition 6.3.2 is made clearer by the
following considerations.

What is the relation between $\overline{g(\chi)}$ and $g(\bar\chi)$ ($\bar\chi$ is the character that takes $a$
to $\overline{\chi(a)}$; i.e., it coincides with the character $\chi^{-1}$)?

$$\overline{g(\chi)} = \sum_t \overline{\chi(t)}\,\zeta^{-t} = \chi(-1)\sum_t \overline{\chi(-t)}\,\zeta^{-t} = \chi(-1)g(\bar\chi).$$

We have used the fact that $\overline{\chi(-1)} = \chi(-1)$, which is obvious since $\chi(-1) = \pm 1$.
Thus the fact that $|g(\chi)|^2 = p$ can be written as $g(\chi)g(\bar\chi) = \chi(-1)p$.
If $\chi$ is the Legendre symbol, this relation is precisely the result in Proposition
6.3.2.


§3 Jacobi Sums 


Consider the equation $x^2 + y^2 = 1$ over the field $F_p$. Since $F_p$ is finite,
the equation has only finitely many solutions. Let $N(x^2 + y^2 = 1)$ be that
number. We would like to determine this value explicitly.

Notice that

$$N(x^2 + y^2 = 1) = \sum_{a+b=1} N(x^2 = a)N(y^2 = b),$$

where the sum is over all pairs $a, b \in F_p$ such that $a + b = 1$. Since $N(x^2 = a) = 1 + (a/p)$,
we obtain by substitution that

$$N(x^2 + y^2 = 1) = p + \sum_a (a/p) + \sum_b (b/p) + \sum_{a+b=1} (a/p)(b/p).$$

The first two sums are zero, so we are left with the task of evaluating the
last sum. We shall see shortly that its value is $-(-1)^{(p-1)/2}$. Thus
$N(x^2 + y^2 = 1)$ is $p - 1$ if $p \equiv 1 \ (4)$ and $p + 1$ if $p \equiv 3 \ (4)$. The reader is
invited to check this result numerically for the first few primes.
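The numerical check suggested above takes only a few lines. The following sketch counts the solutions by direct enumeration; the name `count_circle` is illustrative.

```python
def count_circle(p):
    """Number of pairs (x, y) in F_p x F_p with x^2 + y^2 = 1."""
    return sum(1 for x in range(p) for y in range(p)
               if (x * x + y * y - 1) % p == 0)


for p in [3, 5, 7, 11, 13, 17, 19]:
    expected = p - 1 if p % 4 == 1 else p + 1
    print(p, count_circle(p), expected)
```

The last two columns should agree for every odd prime in the list.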

Let us go a step further and try to evaluate $N(x^3 + y^3 = 1)$. As before
we have

$$N(x^3 + y^3 = 1) = \sum_{a+b=1} N(x^3 = a)N(y^3 = b).$$

If $p \equiv 2 \ (3)$, then $N(x^3 = a) = 1$ for all $a$ since $(3, p - 1) = 1$. It follows
that $N(x^3 + y^3 = 1) = p$ in this case. Assume now that $p \equiv 1 \ (3)$. Let
$\chi \ne \varepsilon$ be a character of order 3. Then $\chi^2$ is a character of order 3 and $\chi^2 \ne \varepsilon$.
Thus $\varepsilon$, $\chi$, and $\chi^2$ are all the characters of order dividing 3, henceforth called cubic
characters. By Proposition 8.1.5 we have $N(x^3 = a) = 1 + \chi(a) + \chi^2(a)$.
Thus

$$N(x^3 + y^3 = 1) = \sum_{a+b=1} \sum_{i=0}^{2} \chi^i(a) \sum_{j=0}^{2} \chi^j(b) = \sum_i \sum_j \Bigl(\sum_{a+b=1} \chi^i(a)\chi^j(b)\Bigr).$$

The inner sums are similar to the sum that occurred in the analysis of
$N(x^2 + y^2 = 1)$.

Definition. Let $\chi$ and $\lambda$ be characters of $F_p$ and set $J(\chi, \lambda) = \sum_{a+b=1} \chi(a)\lambda(b)$.
$J(\chi, \lambda)$ is called a Jacobi sum.

To complete the analysis of $N(x^2 + y^2 = 1)$ and $N(x^3 + y^3 = 1)$ we
need to obtain information on the value of Jacobi sums. The following
theorem not only supplies this information, but shows as well a surprising
connection between Jacobi sums and Gauss sums.


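Theorem 1. Let $\chi$ and $\lambda$ be nontrivial characters. Then

(a) $J(\varepsilon, \varepsilon) = p$.

(b) $J(\varepsilon, \chi) = 0$.

(c) $J(\chi, \chi^{-1}) = -\chi(-1)$.

(d) If $\chi\lambda \ne \varepsilon$, then

$$J(\chi, \lambda) = \frac{g(\chi)g(\lambda)}{g(\chi\lambda)}.$$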

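Proof. Part (a) is immediate, and part (b) is an immediate consequence of
Proposition 8.1.2.

To prove part (c), notice that

$$\sum_{a+b=1} \chi(a)\chi^{-1}(b) = \sum_{\substack{a+b=1 \\ b \ne 0}} \chi\Bigl(\frac{a}{b}\Bigr) = \sum_{a \ne 1} \chi\Bigl(\frac{a}{1-a}\Bigr).$$

Set $a/(1 - a) = c$. If $c \ne -1$, then $a = c/(1 + c)$. It follows that as $a$
varies over $F_p$, less the element 1, $c$ varies over $F_p$, less the element $-1$.
Thus

$$J(\chi, \chi^{-1}) = \sum_{c \ne -1} \chi(c) = -\chi(-1).$$

To prove part (d), notice that

$$g(\chi)g(\lambda) = \Bigl(\sum_x \chi(x)\zeta^x\Bigr)\Bigl(\sum_y \lambda(y)\zeta^y\Bigr) = \sum_{x,y} \chi(x)\lambda(y)\zeta^{x+y} = \sum_t \Bigl(\sum_{x+y=t} \chi(x)\lambda(y)\Bigr)\zeta^t. \qquad (1)$$

If $t = 0$, then $\sum_{x+y=0} \chi(x)\lambda(y) = \sum_x \chi(x)\lambda(-x) = \lambda(-1)\sum_x \chi\lambda(x) = 0$,
since $\chi\lambda \ne \varepsilon$ by assumption.

If $t \ne 0$, define $x'$ and $y'$ by $x = tx'$ and $y = ty'$. If $x + y = t$, then
$x' + y' = 1$. It follows that

$$\sum_{x+y=t} \chi(x)\lambda(y) = \sum_{x'+y'=1} \chi(tx')\lambda(ty') = \chi\lambda(t)J(\chi, \lambda).$$

Substituting into Equation (1) yields

$$g(\chi)g(\lambda) = \sum_t \chi\lambda(t)J(\chi, \lambda)\zeta^t = J(\chi, \lambda)g(\chi\lambda). \qquad □$$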

Corollary. If $\chi$, $\lambda$, and $\chi\lambda$ are not equal to $\varepsilon$, then $|J(\chi, \lambda)| = \sqrt{p}$.

Proof. Take the absolute value of both sides of the equation in part (d) and
use Proposition 8.2.2. □
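Theorem 1 and its corollary can be spot-checked numerically. The sketch below is an illustration rather than a proof: characters are taken as $\lambda^m$ for a brute-force primitive root, with $\varepsilon(0) = 1$ and $\chi(0) = 0$ otherwise, and the helper names are assumptions of this example.

```python
import cmath


def character(p, m):
    """chi = lambda^m as a list indexed by residues 0..p-1 (odd prime p)."""
    g = next(h for h in range(2, p)
             if len({pow(h, k, p) for k in range(p - 1)}) == p - 1)
    chi = [1 if m % (p - 1) == 0 else 0] + [0] * (p - 1)
    for k in range(p - 1):
        chi[pow(g, k, p)] = cmath.exp(2j * cmath.pi * m * k / (p - 1))
    return chi


def gauss_sum(p, m):
    """g(chi) for chi = lambda^m."""
    chi = character(p, m)
    return sum(chi[t] * cmath.exp(2j * cmath.pi * t / p) for t in range(p))


def jacobi_sum(p, m, k):
    """J(lambda^m, lambda^k) = sum over a + b = 1 of lambda^m(a) * lambda^k(b)."""
    chi, lam = character(p, m), character(p, k)
    return sum(chi[a] * lam[(1 - a) % p] for a in range(p))


p = 13
m, k = 2, 5  # chi = lambda^2, lambda^5; their product lambda^7 is nontrivial
lhs = jacobi_sum(p, m, k) * gauss_sum(p, m + k)
rhs = gauss_sum(p, m) * gauss_sum(p, k)
print(abs(lhs - rhs) < 1e-8, round(abs(jacobi_sum(p, m, k)) ** 2, 6))
```

The first value tests part (d) in the form $J(\chi, \lambda)g(\chi\lambda) = g(\chi)g(\lambda)$; the second should be close to $p$.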

We now return to the analysis of $N(x^2 + y^2 = 1)$ and $N(x^3 + y^3 = 1)$.
In the former case, it was necessary to evaluate the sum $\sum_{a+b=1} (a/p)(b/p)$.
Case (c) of Theorem 1 is applicable and gives the result $-(-1/p) =
-(-1)^{(p-1)/2}$, as was stated earlier.

In the case of $N(x^3 + y^3 = 1)$ we had to evaluate the sums
$\sum_{a+b=1} \chi^i(a)\chi^j(b)$, where $\chi$ is a cubic character. Applying the theorem leads
to the result

$$N(x^3 + y^3 = 1) = p - \chi(-1) - \chi^2(-1) + J(\chi, \chi) + J(\chi^2, \chi^2).$$

Since $-1 = (-1)^3$ we have $\chi(-1) = \chi^3(-1) = 1$. Also notice that
$\chi^2 = \chi^{-1} = \bar\chi$. Thus

$$N(x^3 + y^3 = 1) = p - 2 + 2\,\mathrm{Re}\,J(\chi, \chi).$$

This result is not as nice as the result for $N(x^2 + y^2 = 1)$, since we do
not know $J(\chi, \chi)$ explicitly. Nevertheless, by the corollary to Theorem 1
we know that $|J(\chi, \chi)| = \sqrt{p}$, so we have the estimate

$$|N(x^3 + y^3 = 1) - p + 2| \le 2\sqrt{p}.$$

If we write $N_p$ for the number of solutions to $x^3 + y^3 = 1$ in the field
$F_p$, then the estimate says that $N_p$ is approximately equal to $p - 2$ with
an "error term" $2\sqrt{p}$. This shows that for large primes $p$ there are always
many solutions.

If $p \equiv 1 \ (3)$, there are always at least six solutions since $x^3 = 1$ and
$y^3 = 1$ have three solutions each and we can write $1 + 0 = 1$ and $0 + 1 = 1$.
For $p = 7$ and 13 these are the only solutions. For $p = 19$ other solutions
exist; e.g., $3^3 + 10^3 \equiv 1 \ (19)$. These "nontrivial" solutions exist for all
primes $p \ge 19$ since it follows from the estimate that $N_p \ge p - 2 - 2\sqrt{p} > 6$
for $p \ge 19$.

Using Jacobi sums we can easily extend our analysis to equations of the
form $ax^n + by^n = 1$, but we shall not go more deeply into this matter now.

The corollary to Theorem 1 has two immediate consequences of considerable
interest.

Proposition 8.3.1. If $p \equiv 1 \ (4)$, then there exist integers $a$ and $b$ such that
$a^2 + b^2 = p$.

If $p \equiv 1 \ (3)$, then there exist integers $a$ and $b$ such that $a^2 - ab + b^2 = p$.

Proof. If $p \equiv 1 \ (4)$, there is a character $\chi$ of order 4 (if $\lambda$ has order $p - 1$, let
$\chi = \lambda^{(p-1)/4}$). The values of $\chi$ are in the set $\{1, -1, i, -i\}$, where $i = \sqrt{-1}$.
Thus $J(\chi, \chi) = \sum_{s+t=1} \chi(s)\chi(t) \in \mathbb{Z}[i]$ (see Chapter 1, Section 4). It follows
that $J(\chi, \chi) = a + bi$, where $a, b \in \mathbb{Z}$; thus $p = |J(\chi, \chi)|^2 = a^2 + b^2$.

If $p \equiv 1 \ (3)$, there is a character $\chi$ of order 3. The values of $\chi$ are in the
set $\{1, \omega, \omega^2\}$, where $\omega = e^{2\pi i/3} = (-1 + \sqrt{-3})/2$. Thus $J(\chi, \chi) \in \mathbb{Z}[\omega]$.
As above, we have $J(\chi, \chi) = a + b\omega$, where $a, b \in \mathbb{Z}$ and $p = |J(\chi, \chi)|^2 =
|a + b\omega|^2 = a^2 - ab + b^2$. □
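The proof is constructive in the sense that the Jacobi sum itself supplies $a$ and $b$. The following floating-point sketch computes $J(\chi, \chi)$ for a character of order 4 or 3 and rounds its coordinates; the names `jacobi_self`, `two_squares`, and `eisenstein_form` are illustrative, and the approach is only practical for small primes.

```python
import cmath
from math import sqrt


def character(p, m):
    """chi = lambda^m as a list indexed by residues 0..p-1 (odd prime p)."""
    g = next(h for h in range(2, p)
             if len({pow(h, k, p) for k in range(p - 1)}) == p - 1)
    chi = [1 if m % (p - 1) == 0 else 0] + [0] * (p - 1)
    for k in range(p - 1):
        chi[pow(g, k, p)] = cmath.exp(2j * cmath.pi * m * k / (p - 1))
    return chi


def jacobi_self(p, order):
    """J(chi, chi) for chi = lambda^((p-1)/order); requires order | p - 1."""
    chi = character(p, (p - 1) // order)
    return sum(chi[s] * chi[(1 - s) % p] for s in range(p))


def two_squares(p):
    """(a, b) with J(chi, chi) = a + b i for chi of order 4 (p = 1 mod 4)."""
    J = jacobi_self(p, 4)
    return round(J.real), round(J.imag)


def eisenstein_form(p):
    """(a, b) with J(chi, chi) = a + b omega for chi of order 3 (p = 1 mod 3)."""
    J = jacobi_self(p, 3)
    b = round(J.imag / (sqrt(3) / 2))  # Im(omega) = sqrt(3)/2
    a = round(J.real + b / 2)          # Re(a + b omega) = a - b/2
    return a, b


for p in (5, 13, 17, 29):
    a, b = two_squares(p)
    print(p, (a, b), a * a + b * b)
for p in (7, 13, 19, 31):
    a, b = eisenstein_form(p)
    print(p, (a, b), a * a - a * b + b * b)
```

In each printed row the last entry should equal $p$. The signs of $a$ and $b$ depend on which character of the given order is chosen.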

The fact that primes $p \equiv 1 \ (4)$ can be written as the sum of two squares
was discovered by Fermat. It is not hard to prove that if $a, b > 0$, $a$ is odd and
$b$ is even, then the representation $p = a^2 + b^2$ is unique.

If $p \equiv 1 \ (3)$, the representation $p = a^2 - ab + b^2$ is not unique even if we
assume that $a, b > 0$. This can be seen from the equations

$$a^2 - ab + b^2 = (b - a)^2 - (b - a)b + b^2 = a^2 - a(a - b) + (a - b)^2.$$

However, we can reformulate things so that the result is unique. If $p = a^2 -
ab + b^2$, then $4p = (2a - b)^2 + 3b^2 = (2b - a)^2 + 3a^2 = (a + b)^2 + 3(a - b)^2$.

We claim that 3 divides either $a$, $b$, or $a - b$. Suppose that $3 \nmid a$ and that
$3 \nmid b$. If $a \equiv 1 \ (3)$ and $b \equiv 2 \ (3)$, or $a \equiv 2 \ (3)$ and $b \equiv 1 \ (3)$, then $a^2 - ab +
b^2 \equiv 0 \ (3)$, which implies that $3 \mid p$, a contradiction. Thus $3 \mid a - b$, and we
have

Proposition 8.3.2. If $p \equiv 1 \ (3)$, then there are integers $A$ and $B$ such that
$4p = A^2 + 27B^2$. In this representation of $4p$, $A$ and $B$ are uniquely determined
up to sign.

Proof. The proof of the uniqueness is left to the Exercises. □ 

Theorem 1 together with a simple argument leads to a further interesting
relation between Gauss sums and Jacobi sums.

Proposition 8.3.3. Suppose that $p \equiv 1 \ (n)$ and that $\chi$ is a character of
order $n > 2$. Then

$$g(\chi)^n = \chi(-1)\,p\,J(\chi, \chi)J(\chi, \chi^2) \cdots J(\chi, \chi^{n-2}).$$

Proof. Using part (d) of Theorem 1 we have $g(\chi)^2 = J(\chi, \chi)g(\chi^2)$. Multiply
both sides by $g(\chi)$ and we get $g(\chi)^3 = J(\chi, \chi)J(\chi, \chi^2)g(\chi^3)$. Continuing in this
way shows that

$$g(\chi)^{n-1} = J(\chi, \chi)J(\chi, \chi^2) \cdots J(\chi, \chi^{n-2})g(\chi^{n-1}). \qquad (2)$$

Now $\chi^{n-1} = \chi^{-1} = \bar\chi$. Thus, as we have seen, $g(\chi)g(\chi^{n-1}) = g(\chi)g(\bar\chi) =
\chi(-1)p$. The result follows upon multiplying both sides of Equation (2)
by $g(\chi)$. □

Corollary. If $\chi$ is a cubic character, then

$$g(\chi)^3 = pJ(\chi, \chi).$$

Proof. This is simply a special case of the proposition and the fact that
$\chi(-1) = \chi((-1)^3) = 1$. □

Using this corollary, we are in a position to analyze more fully the complex
number $J(\chi, \chi)$ that occurred in the discussion of $N(x^3 + y^3 = 1)$. We
have seen that $J(\chi, \chi) = a + b\omega$, where $a, b \in \mathbb{Z}$ and $\omega = e^{2\pi i/3} =
(-1 + \sqrt{-3})/2$.

Proposition 8.3.4. Suppose that $p \equiv 1 \ (3)$ and that $\chi$ is a cubic character. Set
$J(\chi, \chi) = a + b\omega$ as above. Then

(a) $b \equiv 0 \ (3)$.

(b) $a \equiv -1 \ (3)$.

Proof. We shall work with congruences in the ring of algebraic integers as in
Chapter 6:

$$g(\chi)^3 = \Bigl(\sum_t \chi(t)\zeta^t\Bigr)^3 \equiv \sum_t \chi(t)^3\zeta^{3t} \ (3).$$

Since $\chi(0) = 0$ and $\chi(t)^3 = 1$ for $t \ne 0$ we have $\sum_t \chi(t)^3\zeta^{3t} = \sum_{t \ne 0} \zeta^{3t}
= -1$. Thus

$$g(\chi)^3 = pJ(\chi, \chi) \equiv a + b\omega \equiv -1 \ (3).$$

Working with $\bar\chi$ instead of $\chi$ and remembering that $g(\bar\chi) = \overline{g(\chi)}$ we find that

$$g(\bar\chi)^3 = pJ(\bar\chi, \bar\chi) \equiv a + b\bar\omega \equiv -1 \ (3).$$

Subtracting yields $b(\omega - \bar\omega) \equiv 0 \ (3)$, or $b\sqrt{-3} \equiv 0 \ (3)$. Thus $-3b^2 \equiv
0 \ (9)$ and it follows that $3 \mid b$. Since $3 \mid b$ and $a + b\omega \equiv -1 \ (3)$, we must have
$a \equiv -1 \ (3)$, which completes the proof. □

Corollary. Let $A = 2a - b$ and $B = b/3$. Then $A \equiv 1 \ (3)$ and

$$4p = A^2 + 27B^2.$$

Proof. Since $J(\chi, \chi) = a + b\omega$ and $|J(\chi, \chi)|^2 = p$ we have $p = a^2 - ab + b^2$.
Thus $4p = (2a - b)^2 + 3b^2$ and $4p = A^2 + 27B^2$.

By Proposition 8.3.4, $3 \mid b$ and $a \equiv -1 \ (3)$. Therefore, $A = 2a - b \equiv 1 \ (3)$. □

We are now ready to prove the following beautiful theorem due to Gauss.

Theorem 2. Suppose that $p \equiv 1 \ (3)$. Then there are integers $A$ and $B$ such that
$4p = A^2 + 27B^2$. If we require that $A \equiv 1 \ (3)$, $A$ is uniquely determined,
and

$$N(x^3 + y^3 = 1) = p - 2 + A.$$

Proof. We have already shown that $N(x^3 + y^3 = 1) = p - 2 + 2\,\mathrm{Re}\,J(\chi, \chi)$.
Since $J(\chi, \chi) = a + b\omega$ as above, we have $\mathrm{Re}\,J(\chi, \chi) = (2a - b)/2$. Thus
$2\,\mathrm{Re}\,J(\chi, \chi) = 2a - b = A \equiv 1 \ (3)$. Uniqueness is left as an exercise. □

Let us illustrate this result with two examples, $p = 61$ and $p = 67$.

$4 \cdot 61 = 1^2 + 27 \cdot 3^2$. Thus the number of solutions to $x^3 + y^3 = 1$ in $F_{61}$
is $61 - 2 + 1 = 60$.

Now, $4 \cdot 67 = 5^2 + 27 \cdot 3^2$. We must be careful here; since $5 \not\equiv 1 \ (3)$
we must choose $A = -5$. The answer is thus $67 - 2 - 5 = 60$, which by
coincidence (?) is the same as for $p = 61$.
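Both examples, and the formula $N(x^3 + y^3 = 1) = p - 2 + A$ in general, can be checked by a direct count. In the sketch below the names `count_cubic` and `gauss_A` are illustrative; $A$ is found by a plain search over $B$ rather than from the Jacobi sum, and its sign is then fixed by the condition $A \equiv 1 \ (3)$.

```python
from math import isqrt


def count_cubic(p):
    """N(x^3 + y^3 = 1) in F_p, via the number of cube roots of each residue."""
    freq = [0] * p
    for x in range(p):
        freq[pow(x, 3, p)] += 1
    return sum(freq[a] * freq[(1 - a) % p] for a in range(p))


def gauss_A(p):
    """The A with 4p = A^2 + 27 B^2 and A = 1 (mod 3), for a prime p = 1 (mod 3)."""
    B = 1
    while 27 * B * B < 4 * p:
        rest = 4 * p - 27 * B * B
        A = isqrt(rest)
        if A * A == rest:
            return A if A % 3 == 1 else -A
        B += 1
    raise ValueError("no representation found; is p a prime with p = 1 (mod 3)?")


for p in (7, 13, 19, 61, 67):
    A = gauss_A(p)
    print(p, A, count_cubic(p), p - 2 + A)
```

The last two columns should agree, and the rows for 61 and 67 reproduce the two worked examples.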


§4 The Equation $x^n + y^n = 1$ in $F_p$

We shall assume that $p \equiv 1 \ (n)$ and investigate the number of solutions to
the equation $x^n + y^n = 1$ over the field $F_p$. The methods of Section 3 are
directly applicable.

$$N(x^n + y^n = 1) = \sum_{a+b=1} N(x^n = a)N(y^n = b).$$

Let $\chi$ be a character of order $n$. By Proposition 8.1.5

$$N(x^n = a) = \sum_{i=0}^{n-1} \chi^i(a).$$

Combining these results yields

$$N(x^n + y^n = 1) = \sum_{j=0}^{n-1} \sum_{i=0}^{n-1} J(\chi^j, \chi^i).$$

Theorem 1 can be used to estimate this sum. When $i = j = 0$ we have
$J(\chi^0, \chi^0) = J(\varepsilon, \varepsilon) = p$. When $j + i = n$, $\chi^j = (\chi^i)^{-1}$ so that $J(\chi^j, \chi^i) =
-\chi^j(-1)$. The sum of these terms is $-\sum_{j=1}^{n-1} \chi^j(-1)$. Notice that $\sum_{j=0}^{n-1} \chi^j(-1)$
is $n$ when $-1$ is an $n$th power and zero otherwise. Thus the contribution of
these terms is $1 - \delta_n(-1)n$, where $\delta_n(-1)$ has the obvious meaning. Finally,
if $i = 0$ and $j \ne 0$ or $i \ne 0$ and $j = 0$, then $J(\chi^j, \chi^i) = 0$. Thus

$$N(x^n + y^n = 1) = p + 1 - \delta_n(-1)n + \sum_{i,j} J(\chi^j, \chi^i).$$

The sum is over indices $i$ and $j$ between 1 and $n - 1$ subject to the condition
that $i + j \ne n$. There are $(n - 1)^2 - (n - 1) = (n - 1)(n - 2)$ such
terms and they all have absolute value $\sqrt{p}$. Thus

Proposition 8.4.1.

$$|N(x^n + y^n = 1) + \delta_n(-1)n - (p + 1)| \le (n - 1)(n - 2)\sqrt{p}.$$

The term $\delta_n(-1)n$ will be interpreted later as the number of points "at
infinity" on the curve $x^n + y^n = 1$.

For large $p$ the above estimate shows the existence of many nontrivial
solutions.
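The estimate of Proposition 8.4.1 can be tested by brute force for small $n$ and $p$. The following sketch assumes $p \equiv 1 \ (n)$; the function names are illustrative, and $\delta_n(-1)$ is computed by simply searching for an $n$th root of $-1$.

```python
from math import sqrt


def count_fermat(p, n):
    """N(x^n + y^n = 1) in F_p, via the number of n-th roots of each residue."""
    freq = [0] * p
    for x in range(p):
        freq[pow(x, n, p)] += 1
    return sum(freq[a] * freq[(1 - a) % p] for a in range(p))


def delta(p, n):
    """1 if -1 is an n-th power in F_p, and 0 otherwise."""
    return 1 if any(pow(x, n, p) == p - 1 for x in range(1, p)) else 0


def within_bound(p, n):
    """Check |N + delta_n(-1) n - (p + 1)| <= (n - 1)(n - 2) sqrt(p)."""
    N = count_fermat(p, n)
    return abs(N + delta(p, n) * n - (p + 1)) <= (n - 1) * (n - 2) * sqrt(p)


for p, n in [(13, 3), (13, 4), (17, 4), (31, 5), (41, 5), (43, 7)]:
    print(p, n, count_fermat(p, n), within_bound(p, n))
```

For $n = 2$ the right-hand side is zero, so the bound becomes the exact count $N = p + 1 - 2\delta_2(-1)$ found in Section 3.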


§5 More on Jacobi Sums 


Theorem 1 can be generalized in a very fruitful manner. First we need a 
definition. 

Definition. Let $\chi_1, \chi_2, \ldots, \chi_l$ be characters on $F_p$. A Jacobi sum is defined by
the formula

$$J(\chi_1, \chi_2, \ldots, \chi_l) = \sum_{t_1 + \cdots + t_l = 1} \chi_1(t_1)\chi_2(t_2) \cdots \chi_l(t_l).$$

Notice that when $l = 2$ this reduces to our former definition of Jacobi
sum.

It is useful to define another sum, which will be left unnamed:

$$J_0(\chi_1, \ldots, \chi_l) = \sum_{t_1 + \cdots + t_l = 0} \chi_1(t_1)\chi_2(t_2) \cdots \chi_l(t_l).$$

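Proposition 8.5.1.

(a) $J_0(\varepsilon, \varepsilon, \ldots, \varepsilon) = J(\varepsilon, \varepsilon, \ldots, \varepsilon) = p^{l-1}$.

(b) If some but not all of the $\chi_i$ are trivial, then $J_0(\chi_1, \chi_2, \ldots, \chi_l) = J(\chi_1, \chi_2, \ldots, \chi_l) = 0$.

(c) Assume that $\chi_l \ne \varepsilon$. Then

$$J_0(\chi_1, \chi_2, \ldots, \chi_l) = \begin{cases} 0, & \text{if } \chi_1\chi_2 \cdots \chi_l \ne \varepsilon, \\ \chi_l(-1)(p - 1)J(\chi_1, \chi_2, \ldots, \chi_{l-1}), & \text{otherwise.} \end{cases}$$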

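Proof. If $t_1, t_2, \ldots, t_{l-1}$ are chosen (arbitrarily) in $F_p$, then $t_l$ is uniquely
determined by the condition $t_1 + t_2 + \cdots + t_{l-1} + t_l = 0$. Thus $J_0(\varepsilon, \ldots, \varepsilon) =
p^{l-1}$. Similarly for $J(\varepsilon, \ldots, \varepsilon)$.

To prove part (b), assume that $\chi_1, \ldots, \chi_s$ are nontrivial and that
$\chi_{s+1} = \chi_{s+2} = \cdots = \chi_l = \varepsilon$. Then

$$\sum_{t_1 + \cdots + t_l = 0} \chi_1(t_1)\chi_2(t_2) \cdots \chi_l(t_l) = \sum_{t_1, t_2, \ldots, t_{l-1}} \chi_1(t_1)\chi_2(t_2) \cdots \chi_s(t_s) = p^{l-s-1}\Bigl(\sum_{t_1} \chi_1(t_1)\Bigr)\Bigl(\sum_{t_2} \chi_2(t_2)\Bigr) \cdots \Bigl(\sum_{t_s} \chi_s(t_s)\Bigr) = 0.$$

We have used Proposition 8.1.2. Thus $J_0(\chi_1, \chi_2, \ldots, \chi_l) = 0$. Similarly for
$J(\chi_1, \ldots, \chi_l)$.

To prove part (c), notice that

$$J_0(\chi_1, \chi_2, \ldots, \chi_l) = \sum_s \Bigl(\sum_{t_1 + \cdots + t_{l-1} = -s} \chi_1(t_1) \cdots \chi_{l-1}(t_{l-1})\Bigr)\chi_l(s).$$

Since $\chi_l \ne \varepsilon$, $\chi_l(0) = 0$, so we may assume that $s \ne 0$ in the above sum.
If $s \ne 0$, define $t_i'$ by $t_i = -st_i'$. Then

$$\sum_{t_1 + \cdots + t_{l-1} = -s} \chi_1(t_1) \cdots \chi_{l-1}(t_{l-1}) = \chi_1\chi_2 \cdots \chi_{l-1}(-s) \sum_{t_1' + \cdots + t_{l-1}' = 1} \chi_1(t_1') \cdots \chi_{l-1}(t_{l-1}') = \chi_1\chi_2 \cdots \chi_{l-1}(-s)J(\chi_1, \ldots, \chi_{l-1}).$$

Combining these results yields

$$J_0(\chi_1, \chi_2, \ldots, \chi_l) = \chi_1\chi_2 \cdots \chi_{l-1}(-1)J(\chi_1, \ldots, \chi_{l-1}) \sum_{s \ne 0} \chi_1\chi_2 \cdots \chi_l(s).$$

The main result follows since the sum is zero if $\chi_1\chi_2 \cdots \chi_l \ne \varepsilon$ and $p - 1$
if $\chi_1\chi_2 \cdots \chi_l = \varepsilon$. □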

Parts (a) and (b) of Proposition 8.5.1 generalize parts (a) and (b) of 
Theorem 1. Part (d) of Theorem 1 can be generalized as follows. 

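Theorem 3. Assume that $\chi_1, \chi_2, \ldots, \chi_r$ are nontrivial and also that $\chi_1\chi_2\cdots\chi_r$ is
nontrivial. Then

$$g(\chi_1)g(\chi_2)\cdots g(\chi_r) = J(\chi_1, \chi_2, \ldots, \chi_r)\,g(\chi_1\chi_2\cdots\chi_r).$$

Proof. Let $\psi: \mathbb{F}_p \to \mathbb{C}$ be defined by $\psi(t) = \zeta^t$. Then $\psi(t_1 + t_2) = \psi(t_1)\psi(t_2)$,
and $g(\chi) = \sum_t \chi(t)\psi(t)$. The introduction of $\psi$ is for notational convenience.

$$g(\chi_1)g(\chi_2)\cdots g(\chi_r) = \Big(\sum_{t_1}\chi_1(t_1)\psi(t_1)\Big)\cdots\Big(\sum_{t_r}\chi_r(t_r)\psi(t_r)\Big)$$

$$= \sum_s \Big(\sum_{t_1 + t_2 + \cdots + t_r = s} \chi_1(t_1)\chi_2(t_2)\cdots\chi_r(t_r)\Big)\psi(s).$$

If $s = 0$, then by part (c) of Proposition 8.5.1 and the assumption that
$\chi_1\chi_2\cdots\chi_r \ne \varepsilon$,

$$\sum_{t_1 + \cdots + t_r = 0} \chi_1(t_1)\cdots\chi_r(t_r) = 0.$$

If $s \ne 0$, the substitution $t_i = st_i'$ shows that

$$\sum_{t_1 + \cdots + t_r = s} \chi_1(t_1)\cdots\chi_r(t_r) = \chi_1\chi_2\cdots\chi_r(s)J(\chi_1, \chi_2, \ldots, \chi_r).$$

Putting these remarks together, we have

$$g(\chi_1)\cdots g(\chi_r) = J(\chi_1, \chi_2, \ldots, \chi_r)\sum_{s \ne 0}\chi_1\chi_2\cdots\chi_r(s)\psi(s)$$

$$= J(\chi_1, \chi_2, \ldots, \chi_r)\,g(\chi_1\chi_2\cdots\chi_r). \qquad □$$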

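Corollary 1. Suppose that $\chi_1, \chi_2, \ldots, \chi_r$ are nontrivial and that $\chi_1\chi_2\cdots\chi_r$ is
trivial. Then

$$g(\chi_1)g(\chi_2)\cdots g(\chi_r) = \chi_r(-1)\,p\,J(\chi_1, \chi_2, \ldots, \chi_{r-1}).$$

Proof. $g(\chi_1)g(\chi_2)\cdots g(\chi_{r-1}) = J(\chi_1, \ldots, \chi_{r-1})\,g(\chi_1\cdots\chi_{r-1})$ by Theorem 3.
Multiply both sides by $g(\chi_r)$. Since $\chi_1\chi_2\cdots\chi_{r-1} = \chi_r^{-1}$ we have

$$g(\chi_1\cdots\chi_{r-1})g(\chi_r) = g(\chi_r^{-1})g(\chi_r) = \chi_r(-1)p. \qquad □$$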


Corollary 2. Let the hypotheses be as in Corollary 1. Then

$$J(\chi_1, \ldots, \chi_r) = -\chi_r(-1)J(\chi_1, \chi_2, \ldots, \chi_{r-1}).$$

[If $r = 2$, we set $J(\chi_1) = 1$.]

Proof. If $r = 2$, this is the assertion of part (c) of Theorem 1.

Suppose that $r > 2$. In the proof of Theorem 3 use the hypothesis that
$\chi_1\chi_2\cdots\chi_r = \varepsilon$. This yields

$$g(\chi_1)g(\chi_2)\cdots g(\chi_r) = J_0(\chi_1, \chi_2, \ldots, \chi_r) + J(\chi_1, \ldots, \chi_r)\sum_{s \ne 0}\psi(s).$$

Since $\sum_s \psi(s) = 0$, the sum in the formula is equal to $-1$. By part (c)
of Proposition 8.5.1, we have $J_0(\chi_1, \ldots, \chi_r) = \chi_r(-1)(p-1)J(\chi_1, \ldots, \chi_{r-1})$.
By Corollary 1, $g(\chi_1)\cdots g(\chi_r) = \chi_r(-1)\,p\,J(\chi_1, \ldots, \chi_{r-1})$. Putting these
results together proves the corollary. □

Theorem 4. Assume that $\chi_1, \chi_2, \ldots, \chi_r$ are nontrivial.

(a) If $\chi_1\chi_2\cdots\chi_r \ne \varepsilon$, then

$$|J(\chi_1, \chi_2, \ldots, \chi_r)| = p^{(r-1)/2}.$$

(b) If $\chi_1\chi_2\cdots\chi_r = \varepsilon$, then

$$|J_0(\chi_1, \chi_2, \ldots, \chi_r)| = (p-1)p^{(r/2)-1}$$

and

$$|J(\chi_1, \chi_2, \ldots, \chi_r)| = p^{(r/2)-1}.$$

Proof. If $\chi$ is nontrivial, $|g(\chi)| = \sqrt{p}$. Part (a) follows directly from Theorem 3.

Part (b) follows similarly from part (c) of Proposition 8.5.1 and from
Corollary 2 to Theorem 3. □
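
Theorem 3 and Theorem 4 lend themselves to a quick floating-point check. The sketch below is an illustration only: it takes $p = 7$ with primitive root 3, represents each nontrivial character by its table of values, and the function names `character_table`, `gauss_sum`, and `jacobi_sum` are hypothetical helpers, not notation from the text.

```python
import cmath
from itertools import product

def character_table(p, g, k):
    """Values [chi(0), ..., chi(p-1)] of chi(g^j) = exp(2*pi*i*j*k/(p-1)).

    Intended for nontrivial chi only, so chi(0) is left at 0.
    """
    table = [0j] * p
    x = 1
    for j in range(p - 1):
        table[x] = cmath.exp(2j * cmath.pi * j * k / (p - 1))
        x = x * g % p
    return table

def gauss_sum(p, chi):
    """g(chi) = sum over t of chi(t) * zeta^t with zeta = exp(2*pi*i/p)."""
    return sum(chi[t] * cmath.exp(2j * cmath.pi * t / p) for t in range(p))

def jacobi_sum(p, chars):
    """J(chi_1, ..., chi_r): sum over t_1 + ... + t_r = 1 of the product of values."""
    total = 0
    for ts in product(range(p), repeat=len(chars) - 1):
        last = (1 - sum(ts)) % p
        term = chars[-1][last]
        for chi, t in zip(chars, ts):
            term *= chi[t]
        total += term
    return total

p, g = 7, 3                                    # 3 is a primitive root modulo 7
chars = [character_table(p, g, k) for k in (1, 2, 5)]
prod_char = character_table(p, g, 1 + 2 + 5)   # exponent 8 = 2 mod 6, nontrivial

# Theorem 3: g(chi_1) g(chi_2) g(chi_3) = J(chi_1, chi_2, chi_3) g(chi_1 chi_2 chi_3)
lhs = gauss_sum(p, chars[0]) * gauss_sum(p, chars[1]) * gauss_sum(p, chars[2])
rhs = jacobi_sum(p, chars) * gauss_sum(p, prod_char)

# Theorem 4(a): |J| = p^((r-1)/2) when the product is nontrivial
abs_a = abs(jacobi_sum(p, chars))

# Theorem 4(b): three copies of a cubic character have trivial product
cubic = [character_table(p, g, 2)] * 3
abs_b = abs(jacobi_sum(p, cubic))
```

Here `abs_a` should agree with $7^{(3-1)/2} = 7$ and `abs_b` with $7^{(3/2)-1} = \sqrt{7}$, up to rounding error.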


§6 Applications 

Earlier in this chapter we investigated the number of solutions of the equation
$x^2 + y^2 = 1$ in the field $\mathbb{F}_p$. It is natural to ask the same question about the
equation $x_1^2 + x_2^2 + \cdots + x_r^2 = 1$. The answer can easily be found using the
results of Section 5.

Let $\chi$ be a character of order 2 ($\chi(a) = (a/p)$ in our earlier notation).
Then $N(x^2 = a) = 1 + \chi(a)$. Thus

$$N(x_1^2 + \cdots + x_r^2 = 1) = \sum N(x_1^2 = a_1)N(x_2^2 = a_2)\cdots N(x_r^2 = a_r),$$

where the sum is over all $r$-tuples $(a_1, \ldots, a_r)$ such that $a_1 + a_2 + \cdots + a_r = 1$.
Multiplying out, and using Proposition 8.5.1, yields

$$N(x_1^2 + \cdots + x_r^2 = 1) = p^{r-1} + J(\chi, \chi, \ldots, \chi).$$

If $r$ is odd, $\chi^r = \chi$, and if $r$ is even, $\chi^r = \varepsilon$.

Suppose that $r$ is odd. Then Theorem 3 applies and we have $J(\chi, \ldots, \chi) = g(\chi)^{r-1}$.
Since $g(\chi)^2 = \chi(-1)p$ it follows that $J(\chi, \ldots, \chi) = \chi(-1)^{(r-1)/2}p^{(r-1)/2}$.
If $r$ is even, we use Corollary 2 to Theorem 3 and find that
$J(\chi, \chi, \ldots, \chi) = -\chi(-1)^{r/2}p^{(r-2)/2}$. Finally, remember that $\chi(-1) = (-1)^{(p-1)/2}$. Thus


Proposition 8.6.1. If $r$ is odd, then

$$N(x_1^2 + x_2^2 + \cdots + x_r^2 = 1) = p^{r-1} + (-1)^{((r-1)/2)((p-1)/2)}p^{(r-1)/2}.$$

If $r$ is even, then

$$N(x_1^2 + x_2^2 + \cdots + x_r^2 = 1) = p^{r-1} - (-1)^{(r/2)((p-1)/2)}p^{(r/2)-1}.$$
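
The closed formulas of Proposition 8.6.1 can be compared with a direct count for small odd primes. The following sketch uses two hypothetical helper names, `count_sum_of_squares` for the brute-force count and `formula` for the right-hand sides above; it is meant as a sanity check, not as an efficient method.

```python
from itertools import product

def count_sum_of_squares(p, r):
    """Brute-force count of (x_1, ..., x_r) in F_p^r with x_1^2 + ... + x_r^2 = 1."""
    return sum(1 for xs in product(range(p), repeat=r)
               if sum(x * x for x in xs) % p == 1)

def formula(p, r):
    """Right-hand side of Proposition 8.6.1 for an odd prime p."""
    if r % 2 == 1:
        sign = (-1) ** (((r - 1) // 2) * ((p - 1) // 2))
        return p ** (r - 1) + sign * p ** ((r - 1) // 2)
    sign = (-1) ** ((r // 2) * ((p - 1) // 2))
    return p ** (r - 1) - sign * p ** (r // 2 - 1)

# Compare the two for a few small cases.
results = {(p, r): (count_sum_of_squares(p, r), formula(p, r))
           for p in (3, 5, 7) for r in (1, 2, 3, 4)}
```

For instance, with $p = 5$ and $r = 2$ both entries equal 4, corresponding to the solutions $(0, \pm 1)$ and $(\pm 1, 0)$.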


The most general equation that can be treated by these methods has
the form $a_1x_1^{l_1} + a_2x_2^{l_2} + \cdots + a_rx_r^{l_r} = b$, where $a_1, \ldots, a_r, b \in \mathbb{F}_p$, and
$l_1, l_2, \ldots, l_r$ are positive integers. We shall return to this subject in Section 7.
For now, we shall use Jacobi sums to give yet another proof of the law of
quadratic reciprocity.

Let $q$ be an odd prime not equal to $p$, and $\chi$ the character of order 2 on $\mathbb{F}_p$.
Then by Corollary 1 to Theorem 3

$$g(\chi)^{q+1} = (-1)^{(p-1)/2}\,p\,J(\chi, \chi, \ldots, \chi),$$

where there are $q$ components in the Jacobi sum.

Since $q + 1$ is even, $g(\chi)^{q+1} = (g(\chi)^2)^{(q+1)/2} = (-1)^{((p-1)/2)((q+1)/2)}p^{(q+1)/2}$.
Substituting into the formula we find that

$$(-1)^{((p-1)/2)((q-1)/2)}p^{(q-1)/2} = J(\chi, \chi, \ldots, \chi).$$

Now, $J(\chi, \chi, \ldots, \chi) = \sum \chi(t_1)\chi(t_2)\cdots\chi(t_q)$, where the sum is over all
$(t_1, t_2, \ldots, t_q)$ with $t_1 + \cdots + t_q = 1$. If $t = t_1 = t_2 = \cdots = t_q$, then
$t = 1/q$, and the corresponding term of the sum has value $\chi(1/q)^q = \chi(q)$.
If not all the $t_i$ are equal, then there are $q$ different $q$-tuples obtained from
$(t_1, t_2, \ldots, t_q)$ by cyclic permutation. The corresponding terms of the sum
all have the same value. Thus

$$(-1)^{((p-1)/2)((q-1)/2)}p^{(q-1)/2} \equiv \chi(q)\ (q).$$

Since $\chi(q) = (q/p)$ and $p^{(q-1)/2} \equiv (p/q)\ (q)$ we have

$$(-1)^{((p-1)/2)((q-1)/2)}\left(\frac{p}{q}\right) \equiv \left(\frac{q}{p}\right)\ (q),$$

and thus

$$(-1)^{((p-1)/2)((q-1)/2)}\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right).$$
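
The two facts used in this proof, the exact value of $J(\chi, \ldots, \chi)$ with $q$ components and its congruence to $\chi(q)$ modulo $q$, can be confirmed by direct summation for very small primes. This is a sketch under the assumption that the Legendre symbol is computed by Euler's criterion; `legendre`, `jacobi_quadratic`, and `predicted` are illustrative names.

```python
from itertools import product

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def jacobi_quadratic(p, q):
    """J(chi, ..., chi) with q components, chi the quadratic character on F_p."""
    total = 0
    for ts in product(range(p), repeat=q - 1):
        last = (1 - sum(ts)) % p               # forced by t_1 + ... + t_q = 1
        term = legendre(last, p)
        for t in ts:
            term *= legendre(t, p)
        total += term
    return total

def predicted(p, q):
    """The value (-1)^(((p-1)/2)((q-1)/2)) * p^((q-1)/2) derived in the text."""
    return (-1) ** (((p - 1) // 2) * ((q - 1) // 2)) * p ** ((q - 1) // 2)

pairs = [(5, 3), (7, 3), (3, 5), (7, 5)]
values = {(p, q): jacobi_quadratic(p, q) for p, q in pairs}
```

The sum has $p^{q-1}$ terms, so this check is only practical for the smallest pairs of primes.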




§7 A General Theorem 

All the equations we have considered up to now are special cases of

$$a_1x_1^{l_1} + a_2x_2^{l_2} + \cdots + a_rx_r^{l_r} = b, \qquad (3)$$

where $a_1, a_2, \ldots, a_r \in \mathbb{F}_p^*$ and $b \in \mathbb{F}_p$. Let $N$ be the number of solutions. Our
object is to give a formula for $N$ and an estimate for $N$. The methods to be
used are identical with those already developed in the previous sections.

To begin with, we have

$$N = \sum N(x_1^{l_1} = u_1)N(x_2^{l_2} = u_2)\cdots N(x_r^{l_r} = u_r), \qquad (4)$$

where the sum is over all $r$-tuples $(u_1, u_2, \ldots, u_r)$ such that $\sum_{i=1}^r a_iu_i = b$.

We shall assume that $l_1, l_2, \ldots, l_r$ are divisors of $p - 1$, although this is
not necessary (see the Exercises). Let $\chi_i$ vary over the characters of order
dividing $l_i$. Then

$$N(x_i^{l_i} = u_i) = \sum_{\chi_i}\chi_i(u_i).$$

Substituting into Equation (4) we get

$$N = \sum_{\chi_1, \chi_2, \ldots, \chi_r}\ \sum_{\sum a_iu_i = b}\chi_1(u_1)\chi_2(u_2)\cdots\chi_r(u_r). \qquad (5)$$

The inner sum is closely related to the Jacobi sums that we have considered.

It is necessary to treat the cases $b = 0$ and $b \ne 0$ separately.

If $b = 0$, let $t_i = a_iu_i$. Then the inner sum becomes

$$\chi_1(a_1^{-1})\chi_2(a_2^{-1})\cdots\chi_r(a_r^{-1})J_0(\chi_1, \chi_2, \ldots, \chi_r).$$

If $b \ne 0$, let $t_i = b^{-1}a_iu_i$. The inner sum becomes

$$\chi_1\chi_2\cdots\chi_r(b)\,\chi_1(a_1^{-1})\cdots\chi_r(a_r^{-1})J(\chi_1, \chi_2, \ldots, \chi_r).$$

In both cases, if $\chi_1 = \chi_2 = \cdots = \chi_r = \varepsilon$ the term has the value $p^{r-1}$
since $J_0(\varepsilon, \ldots, \varepsilon) = J(\varepsilon, \varepsilon, \ldots, \varepsilon) = p^{r-1}$. If some but not all the $\chi_i$ are
equal to $\varepsilon$, then the term has the value zero. In the first case the value is zero
unless $\chi_1\chi_2\cdots\chi_r = \varepsilon$. All this is a consequence of Proposition 8.5.1.

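Putting this together with Theorem 4 we obtain

Theorem 5. If $b = 0$, then

$$N = p^{r-1} + \sum \chi_1(a_1^{-1})\chi_2(a_2^{-1})\cdots\chi_r(a_r^{-1})J_0(\chi_1, \chi_2, \ldots, \chi_r).$$

The sum is over all $r$-tuples of characters $\chi_1, \chi_2, \ldots, \chi_r$, where $\chi_i^{l_i} = \varepsilon$,
$\chi_i \ne \varepsilon$ for $i = 1, \ldots, r$, and $\chi_1\chi_2\cdots\chi_r = \varepsilon$. If $M$ is the number of such $r$-tuples,
then

$$|N - p^{r-1}| \le M(p-1)p^{(r/2)-1}.$$

If $b \ne 0$, then

$$N = p^{r-1} + \sum \chi_1\chi_2\cdots\chi_r(b)\,\chi_1(a_1^{-1})\cdots\chi_r(a_r^{-1})J(\chi_1, \chi_2, \ldots, \chi_r).$$

The summation is over all $r$-tuples of characters $\chi_1, \ldots, \chi_r$, where $\chi_i^{l_i} = \varepsilon$
and $\chi_i \ne \varepsilon$ for $i = 1, \ldots, r$. If $M_0$ is the number of such $r$-tuples with
$\chi_1\chi_2\cdots\chi_r = \varepsilon$, and $M_1$ is the number of such $r$-tuples with $\chi_1\chi_2\cdots\chi_r \ne \varepsilon$, then

$$|N - p^{r-1}| \le M_0\,p^{(r/2)-1} + M_1\,p^{(r-1)/2}.$$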
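
The estimates of Theorem 5 can be tested against a brute-force count. In the sketch below each nontrivial character of order dividing $l_i$ is identified with an exponent $k_i \in \{1, \ldots, l_i - 1\}$ (the character being the $k_i(p-1)/l_i$-th power of a fixed generator of the character group), so the product is trivial exactly when $\sum k_i(p-1)/l_i \equiv 0 \pmod{p-1}$. The function names are hypothetical, and the code assumes every $l_i$ divides $p - 1$ and every $a_i$ is nonzero.

```python
from itertools import product

def count_solutions(p, coeffs, exps, b):
    """Brute-force N for a_1 x_1^{l_1} + ... + a_r x_r^{l_r} = b over F_p."""
    return sum(1 for xs in product(range(p), repeat=len(coeffs))
               if sum(a * pow(x, l, p) for a, x, l in zip(coeffs, xs, exps)) % p == b % p)

def theorem5_bound(p, exps, b):
    """Right-hand side of the estimate for |N - p^(r-1)| in Theorem 5."""
    r = len(exps)
    m0 = m1 = 0
    for ks in product(*(range(1, l) for l in exps)):
        # exponent of the product character, relative to a generator
        if sum(k * (p - 1) // l for k, l in zip(ks, exps)) % (p - 1) == 0:
            m0 += 1
        else:
            m1 += 1
    if b % p == 0:
        return m0 * (p - 1) * p ** (r / 2 - 1)
    return m0 * p ** (r / 2 - 1) + m1 * p ** ((r - 1) / 2)

p = 7
n_one = count_solutions(p, (1, 1), (3, 3), 1)    # x^3 + y^3 = 1 over F_7
n_zero = count_solutions(p, (1, 1), (3, 3), 0)   # x^3 + y^3 = 0 over F_7
```

For $p = 7$ the count for $x^3 + y^3 = 1$ is 6 and the count for $x^3 + y^3 = 0$ is 19; in the second case the bound $M(p-1)p^{(r/2)-1} = 12$ is attained.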
An immediate consequence of Theorem 5 is worth noting. Let $a_1, a_2, \ldots, a_r$
and $b \in \mathbb{Z}$ and consider the congruence

$$a_1x_1^{l_1} + a_2x_2^{l_2} + \cdots + a_rx_r^{l_r} \equiv b\ (p).$$

Then if $p$ is sufficiently large, the congruence has many solutions. In
fact, the number of solutions tends to infinity as $p$ is taken larger and larger.

Notes 

The inspiration for this chapter is the famous paper of A. Weil [80]. The 
basic relationship between Gauss sums, also known as Lagrange resolvents, 
and Jacobi sums was known to Gauss [34] (unpublished), Jacobi [47], 
Eisenstein [27], and Cauchy. Complete proofs of the fundamental relations 
given in Proposition 8.3.3 and Theorem 1 were published by Eisenstein in 
his paper “Beiträge zur Kreistheilung” in 1844. Eisenstein also introduced
generalized Jacobi sums (Section 5) to obtain a proof of the law of biquadratic 
reciprocity (see Chapter 9). 

Aside from its usefulness in obtaining the Weil-Riemann hypothesis for 
certain hypersurfaces over finite fields (see Chapter 11), the generalized 
Jacobi sum is of importance in the theory of cyclotomy and difference sets. 
For an introduction to this material, see Storer [74]. See also the difficult 
but important continuation of [80] by Weil [81]. 

Material on Gauss and Jacobi sums is scattered throughout the treatise 
of Hasse [41]. He gives a systematic presentation in his last chapter where in 
addition to developing many interesting results he shows how both types of 
sum arise naturally in the theory of cyclotomic number fields. Much of the 
theory in that chapter is distilled from the paper of Davenport and Hasse 
[23]. The latter paper is well worth close study, but it is unfortunately of an 
advanced nature and is probably inaccessible to a beginner. Somewhat less 
difficult are the more recent papers of K. Yamamoto [82] and A. Yokoyama 
[83]. One should also consult the classical treatise of P. Bachman [5]. 

More recently B. C. Berndt and R. J. Evans have studied Gauss, Jacobi, 
and other classical character sums attached to characters of order 6, 8, 12, 24.
For their interesting results and extensive bibliography the reader should 
consult [92] and [95]. See also Leonard and Williams [177]. 

Theorem 2 is proved by Gauss in §358 of Disquisitiones Arithmeticae. He 
does not really state the theorem explicitly. It comes out as a by-product of 
another investigation. What he does, in fact, is to use the theorem to help find 
the algebraic equation satisfied by certain Gauss sums. We have done the 
reverse, using the theory of Gauss sums to derive the theorem. Gauss 
derived other results of this type in his first memoir on biquadratic reciprocity 
[34]. For further historical remarks about this subject, see the introduction 
to the paper of Weil [80]. 

The estimates given in Theorem 5 are derived in the first chapter of 
Borevich and Shafarevich [9]. They use a somewhat different method which
we have outlined in the Exercises. In the special case of quadratic forms, i.e.,
when all the $l_i = 2$, the result goes back at least to Dickson [25].

The technique of counting solutions by means of characters lends itself 
naturally to the problem of finding sequences of integers of prescribed length 
having prescribed kth power character modulo p. This problem is dealt with 
to some extent in Hasse [41]. In an interesting, and elementary paper, 
Davenport [21] shows that the number of sequences of four successive 
quadratic residues between 1 and $p$ satisfies the inequality $|R - p/8| < Kp^{3/4}$,
where $K$ is a constant independent of $p$. Better estimates can be obtained
using the results of Weil. For another paper along the same lines, see Graham 
[36]. 

One final remark on Theorem 5. It is due originally to Weil and inde¬ 
pendently (and almost simultaneously) to L. K. Hua and H. S. Vandiver 
(Proc. Nat. Acad. Sci. U.S.A., 35 (1949), 94-99). With a few simplifications
and addenda we have essentially followed Weil’s presentation. 

Exercises 

1. Let $p$ be a prime and $d = (m, p - 1)$. Prove that $N(x^m = a) = \sum \chi(a)$, the sum being
over all $\chi$ such that $\chi^d = \varepsilon$.

2. With the notation of Exercise 1 show that $N(x^m = a) = N(x^d = a)$ and conclude
that if $d_i = (m_i, p - 1)$, then $\sum_i a_ix_i^{m_i} = b$ and $\sum_i a_ix_i^{d_i} = b$ have the same number
of solutions.

3. Let $\chi$ be a nontrivial multiplicative character of $\mathbb{F}_p$ and $\rho$ be the character of order 2.
Show that $\sum_t \chi(1 - t^2) = J(\chi, \rho)$. [Hint: Evaluate $J(\chi, \rho)$ using the relation
$N(x^2 = a) = 1 + \rho(a)$.]

4. Show, if $k \in \mathbb{F}_p$, $k \ne 0$, that $\sum_t \chi(t(k - t)) = \chi(k^2/2^2)J(\chi, \rho)$.

5. If $\chi^2 \ne \varepsilon$, show that $g(\chi)^2 = \chi(2)^{-2}J(\chi, \rho)g(\chi^2)$. [Hint: Write out $g(\chi)^2$ explicitly
and use Exercise 4.]

6. (continuation) Show that $J(\chi, \chi) = \chi(2)^{-2}J(\chi, \rho)$.

7. Suppose that $p \equiv 1\ (4)$ and that $\chi$ is a character of order 4. Then $\chi^2 = \rho$ and
$J(\chi, \chi) = \chi(-1)J(\chi, \rho)$. [Hint: Evaluate $g(\chi)^4$ in two ways.]

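8. Generalize Exercise 3 in the following way. Suppose that $p$ is a prime. Show that
$\sum_t \chi(1 - t^m) = \sum_\lambda J(\chi, \lambda)$, where $\lambda$ varies over all characters such that $\lambda^m = \varepsilon$.
Conclude that $|\sum_t \chi(1 - t^m)| \le (m - 1)p^{1/2}$.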

9. Suppose that $p \equiv 1\ (3)$ and that $\chi$ is a character of order 3. Prove (using Exercise 5)
that $g(\chi)^3 = p\pi$, where $\pi = \chi(2)J(\chi, \rho)$.

10. (continuation) Show that $\chi\rho$ is a character of order 6 and that
$g(\chi\rho)^6 = (-1)^{(p-1)/2}\,p\,\bar{\pi}^4$.

11. Use Gauss' theorem to find the number of solutions to $x^3 + y^3 = 1$ in $\mathbb{F}_p$ for $p = 13$,
19, 37, and 97.
12. If $p \equiv 1\ (4)$, then we have seen that $p = a^2 + b^2$ with $a, b \in \mathbb{Z}$. If we require that $a$
and $b$ be positive, that $a$ be odd, and that $b$ be even, show that $a$ and $b$ are uniquely
determined. (Hint: Use the fact that unique factorization holds in $\mathbb{Z}[i]$ and that if
$p = a^2 + b^2$ then $a + bi$ is a prime in $\mathbb{Z}[i]$.)

13. If $p \equiv 1\ (3)$, we have seen that $4p = A^2 + 27B^2$ with $A, B \in \mathbb{Z}$. If we require that
$A \equiv 1\ (3)$, show that $A$ is uniquely determined. (Hint: Use the fact that unique
factorization holds in $\mathbb{Z}[\omega]$. This proof is a little trickier than that for Exercise 12.)

14. Suppose that $p \equiv 1\ (n)$ and that $\chi$ is a character of order $n$. Show that $g(\chi)^n \in \mathbb{Z}[\zeta]$,
where $\zeta = e^{2\pi i/n}$.

15. Suppose that $p \equiv 1\ (6)$ and let $\chi$ and $\rho$ be characters of order 3 and 2, respectively.
Show that the number of solutions to $y^2 = x^3 + D$ in $\mathbb{F}_p$ is $p + \pi + \bar{\pi}$, where
$\pi = \chi\rho(D)J(\chi, \rho)$. If $\chi(2) = 1$, show that the number of solutions to $y^2 = x^3 + 1$
is $p + A$, where $4p = A^2 + 27B^2$ and $A \equiv 1\ (3)$. Verify this result numerically
when $p = 31$.

16. Suppose that $p \equiv 1\ (4)$ and that $\chi$ is a character of order 4. Let $N$ be the number of
solutions to $x^4 + y^4 = 1$ in $\mathbb{F}_p$. Show that $N = p + 1 - \delta_4(-1)4 + 2\,\mathrm{Re}\,J(\chi, \chi) + 4\,\mathrm{Re}\,J(\chi, \rho)$.

17. (continuation) By Exercise 7, $J(\chi, \chi) = \chi(-1)J(\chi, \rho)$. Let $\pi = -J(\chi, \rho)$. Show that

(a) $N = p - 3 - 6\,\mathrm{Re}\,\pi$ if $p \equiv 1\ (8)$.

(b) $N = p + 1 - 2\,\mathrm{Re}\,\pi$ if $p \equiv 5\ (8)$.

18. (continuation) Let $\pi = a + bi$. One can show (see Chapter 11, Section 5) that $a$ is
odd, $b$ is even, and $a \equiv 1\ (4)$ if $4 \mid b$ and $a \equiv -1\ (4)$ if $4 \nmid b$. Let $p = A^2 + B^2$ and
fix $A$ by requiring that $A \equiv 1\ (4)$. Then show that

(a) $N = p - 3 - 6A$ if $p \equiv 1\ (8)$.

(b) $N = p + 1 + 2A$ if $p \equiv 5\ (8)$.

19. Find a formula for the number of solutions to $x_1^2 + x_2^2 + \cdots + x_r^2 = 0$ in $\mathbb{F}_p$.

20. Generalize Proposition 8.6.1 by finding an explicit formula for the number of
solutions to $a_1x_1^2 + a_2x_2^2 + \cdots + a_rx_r^2 = 1$ in $\mathbb{F}_p$.

21. Suppose that $p \equiv 1\ (d)$, $\zeta = e^{2\pi i/p}$, and consider $\sum_x \zeta^{ax^d}$. Show that
$\sum_x \zeta^{ax^d} = \sum_r m(r)\zeta^{ar}$, where $m(r) = N(x^d = r)$.

22. (continuation) Prove that $\sum_x \zeta^{ax^d} = \sum_\chi g_a(\chi)$, where the sum is over all $\chi$ such
that $\chi^d = \varepsilon$, $\chi \ne \varepsilon$. Assume that $p \nmid a$.

23. Let $f(x_1, x_2, \ldots, x_n) \in \mathbb{F}_p[x_1, x_2, \ldots, x_n]$. Let $N$ be the number of zeros of $f$
in $\mathbb{F}_p$. Show that $N = p^{n-1} + p^{-1}\sum_{a \ne 0}\big(\sum_{x_1, \ldots, x_n}\zeta^{af(x_1, \ldots, x_n)}\big)$.

24. (continuation) Let $f(x_1, x_2, \ldots, x_n) = a_1x_1^{m_1} + a_2x_2^{m_2} + \cdots + a_nx_n^{m_n}$. Let
$d_i = (m_i, p - 1)$. Show that $N = p^{n-1} + p^{-1}\sum_{a \ne 0}\prod_{i=1}^n\sum_{\chi_i}g_{aa_i}(\chi_i)$, where $\chi_i$ runs over
all characters such that $\chi_i^{d_i} = \varepsilon$ and $\chi_i \ne \varepsilon$.

25. Deduce from Exercise 24 that $|N - p^{n-1}| \le (p - 1)(d_1 - 1)\cdots(d_n - 1)p^{(n/2)-1}$.

26. Let $p$ be a prime, $p \equiv 1\ (4)$, $\chi$ a multiplicative character of order 4 on $\mathbb{F}_p$, and $\rho$ the
Legendre symbol. Put $J(\chi, \rho) = a + bi$. Show

(a) $N(y^2 + x^4 = 1) = p - 1 + 2a$.

(b) $N(y^2 = 1 - x^4) = p + \sum_x \rho(1 - x^4)$.

(c) $2a \equiv -(-1)^{(p-1)/4}\binom{2m}{m}\ (p)$ where $m = (p - 1)/4$.

(d) Verify (c) for $p = 13, 17, 29$.

27. Let $p \equiv 1\ (3)$, $\chi$ a character of order 3, $\rho$ the Legendre symbol. Show

(a) $N(y^2 = 1 - x^3) = p + \sum_x \rho(1 - x^3)$.

(b) $N(y^2 + x^3 = 1) = p + 2\,\mathrm{Re}\,J(\chi, \rho)$.

(c) 2a - b = (p) where J(x, p) = a + bw. 

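28. Let $p \equiv 3\ (4)$ and $\chi$ the quadratic character defined on $\mathbb{Z}/p\mathbb{Z}$. Show

(a) $\sum_{x=1}^{p-1} x\chi(x) = 2\sum_{x=1}^{(p-1)/2} x\chi(x) - p\sum_{x=1}^{(p-1)/2}\chi(x)$.

(b) $\sum_{x=1}^{p-1} x\chi(x) = 4\chi(2)\sum_{x=1}^{(p-1)/2} x\chi(x) - p\chi(2)\sum_{x=1}^{(p-1)/2}\chi(x)$.

(c) If $p \equiv 3\ (8)$ then $\sum_{x=1}^{p-1} x\chi(x)/p = -\tfrac{1}{3}\sum_{x=1}^{(p-1)/2}\chi(x)$.

(d) If $p \equiv 7\ (8)$ then $\sum_{x=1}^{p-1} x\chi(x)/p = -\sum_{x=1}^{(p-1)/2}\chi(x)$.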



Chapter 9 

Cubic and Biquadratic 
Reciprocity 


In Chapter 5 we saw that the law of quadratic reciprocity
provided the answer to the question: For which primes $p$
is the congruence $x^2 \equiv a\ (p)$ solvable? Here $a$ is a fixed
integer. If the same question is considered for congruences
$x^n \equiv a\ (p)$, $n$ a fixed positive integer, we are led into
the realm of the higher reciprocity laws. When $n = 3$ and
4 we speak of cubic and biquadratic reciprocity.

In the introduction to his famous pair of papers,
“Theorie der biquadratischen Reste I, II” [34], Gauss
claims that the theory of quadratic residues had been
brought to such a state of perfection that nothing more
could be wished. On the other hand, “The theory of
cubic and biquadratic residues is by far more difficult.”
He had only been able to deal with certain special cases
for which the proofs had been so difficult that he soon
came to the realization that “... the previously accepted
principles of arithmetic are in no way sufficient for the
foundations of a general theory, that rather such a theory
necessarily demands that to a certain extent the domain
of higher arithmetic needs to be endlessly enlarged ....”
In modern language, he is calling for the establishment
of a theory of algebraic numbers. As a first step, because
this is what is needed for discussing biquadratic residues,
he investigated in detail the arithmetic of the ring $\mathbb{Z}[i]$,
which we now refer to as the ring of Gaussian integers.

Curiously, although Gauss formulated and discovered
the law of biquadratic reciprocity, he did not prove it 
completely. The first complete published proofs of cubic 
and biquadratic reciprocity are due to G. Eisenstein. 

In this chapter we shall formulate and prove the laws 
of cubic and biquadratic reciprocity. We shall give two 
proofs to the law of cubic reciprocity. The first is due to 
Eisenstein and is similar in every way to the proof of the 
law of quadratic reciprocity given in Chapter 6. The second 
proof uses Jacobi sums and is analogous to the proof of 
quadratic reciprocity given in Chapter 8, Section 6. Our
proof of biquadratic reciprocity is also due to Eisenstein.

In Section 10 we establish a “rational” reciprocity law
for biquadratic residues. This elegant result, discovered
by K. Burde in 1969, answers the following problem. If
$p \equiv 1\ (4)$ and $q \equiv 1\ (4)$ are primes and $p$ is a fourth
power modulo $q$, give necessary and sufficient conditions
that $q$ is a fourth power modulo $p$.

In Section 11 we establish, with the use of Jacobi sums, 
Gauss’ criterion for the constructibility of a regular polygon. 

The chapter concludes with a short discussion of 
Kummer's problem concerning the distribution of cubic
Gauss sums. 


§1 The Ring $\mathbb{Z}[\omega]$

Let $\omega = (-1 + \sqrt{-3})/2$. The ring $\mathbb{Z}[\omega]$ was defined and discussed in
Chapter 1, Section 4. Its elements are complex numbers of the form $a + b\omega$,
$a, b \in \mathbb{Z}$. If $\alpha = a + b\omega \in \mathbb{Z}[\omega]$, define the norm of $\alpha$, $N\alpha$, by the formula
$N\alpha = \alpha\bar{\alpha} = a^2 - ab + b^2$. Here $\bar{\alpha}$ means the complex conjugate of $\alpha$.
In Chapter 1 we used the notation $\lambda(\alpha)$ instead of $N\alpha$. The change is merely
a matter of conforming to standard notation. For notational convenience
we shall set $D = \mathbb{Z}[\omega]$.
We have proved earlier that D is a unique factorization domain. Our 
first task here is to discover the units and the prime elements in D. 

Proposition 9.1.1. $\alpha \in D$ is a unit iff $N\alpha = 1$. The units in $D$ are $1, -1, \omega,
-\omega, \omega^2$, and $-\omega^2$.

Proof. If $N\alpha = 1$, $\alpha\bar{\alpha} = 1$, which implies that $\alpha$ is a unit since $\bar{\alpha} \in D$.

If $\alpha$ is a unit, there is a $\beta \in D$ such that $\alpha\beta = 1$. Thus $N\alpha N\beta = 1$. Since $N\alpha$
and $N\beta$ are positive integers this implies that $N\alpha = 1$.

Now suppose that $\alpha = a + b\omega$ is a unit. Then $1 = a^2 - ab + b^2$ or
$4 = (2a - b)^2 + 3b^2$. There are two possibilities:

(a) $2a - b = \pm 1$, $b = \pm 1$.

(b) $2a - b = \pm 2$, $b = 0$.

Solving these six pairs of equations yields the result $1, -1, \omega, -\omega,
-1 - \omega$ and $1 + \omega$. Since $\omega^2 + \omega + 1 = 0$ the last two elements are $\omega^2$
and $-\omega^2$. We are done. □
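
The enumeration of units can be reproduced by a finite search, since $4 = (2a - b)^2 + 3b^2$ forces $|a|, |b| \le 1$. The sketch below represents $a + b\omega$ by the integer pair `(a, b)`; that representation and the name `norm` are choices made here for illustration.

```python
def norm(a, b):
    """Norm of a + b*omega in Z[omega]: a^2 - ab + b^2."""
    return a * a - a * b + b * b

# 4 = (2a - b)^2 + 3b^2 bounds |a| and |b| by 1, so a small box suffices.
units = sorted((a, b) for a in range(-2, 3) for b in range(-2, 3)
               if norm(a, b) == 1)
print(units)
```

The six pairs found are $(\pm 1, 0)$, $(0, \pm 1)$, $(1, 1)$, and $(-1, -1)$, that is, $\pm 1$, $\pm\omega$, $1 + \omega = -\omega^2$, and $-1 - \omega = \omega^2$.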


To investigate primes in $D$ it is important to realize that primes in $\mathbb{Z}$
need not be prime in $D$. For example, $7 = (3 + \omega)(2 - \omega)$. For this reason
we shall speak of primes in $\mathbb{Z}$ as rational primes and refer to primes in $D$
simply as primes.

Proposition 9.1.2. If $\pi$ is a prime in $D$, then there is a rational prime $p$ such that
$N\pi = p$ or $p^2$. In the former case $\pi$ is not associate to a rational prime; in the
latter case $\pi$ is associate to $p$.

Proof. We have $N\pi = n > 1$, or $\pi\bar{\pi} = n$. Now $n$ is a product of rational primes.
Thus $\pi \mid p$ for some rational prime $p$. If $p = \pi\gamma$, $\gamma \in D$, then $N\pi N\gamma = Np = p^2$.
Thus either $N\pi = p^2$ and $N\gamma = 1$ or $N\pi = p$. In the former case $\gamma$ is a unit
and therefore $\pi$ is associate to $p$. In the latter case if $\pi = uq$, $u$ a unit and $q$ a
rational prime, then $p = N\pi = NuNq = q^2$, which is nonsense. Thus $\pi$
is not associate to a rational prime. □

Proposition 9.1.3. If $\pi \in D$ is such that $N\pi = p$, a rational prime, then $\pi$ is a
prime in $D$.

Proof. If $\pi$ were not prime in $D$, then we could write $\pi = \rho\gamma$ with $N\rho$,
$N\gamma > 1$. Then $p = N\pi = N\rho N\gamma$, which cannot be true since $p$ is prime in $\mathbb{Z}$.
Thus $\pi$ is a prime in $D$. □

The following result classifies primes in D. 

Proposition 9.1.4. Suppose that $p$ and $q$ are rational primes. If $q \equiv 2\ (3)$, then
$q$ is prime in $D$. If $p \equiv 1\ (3)$, then $p = \pi\bar{\pi}$, where $\pi$ is prime in $D$. Finally
$3 = -\omega^2(1 - \omega)^2$, and $1 - \omega$ is prime in $D$.


Proof. Suppose that $p$ were not a prime. Then $p = \pi\gamma$, with $N\pi > 1$, $N\gamma > 1$.
Thus $p^2 = N\pi N\gamma$ and $N\pi = p$. Let $\pi = a + b\omega$. Then $p = a^2 - ab + b^2$
or $4p = (2a - b)^2 + 3b^2$, yielding $p \equiv (2a - b)^2\ (3)$. If $3 \nmid p$ we have
$p \equiv 1\ (3)$ for 1 is the only nonzero square mod 3. It follows immediately that
if $q \equiv 2\ (3)$, it is a prime in $D$.

Now, suppose that $p \equiv 1\ (3)$. By quadratic reciprocity we have

$$\left(\frac{-3}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right) = (-1)^{(p-1)/2}\left(\frac{p}{3}\right)(-1)^{((p-1)/2)((3-1)/2)} = \left(\frac{p}{3}\right) = \left(\frac{1}{3}\right) = 1.$$

Hence, there is an $a \in \mathbb{Z}$ such that $a^2 \equiv -3\ (p)$ or $pb = a^2 + 3$ for some
$b \in \mathbb{Z}$. Thus $p$ divides $(a + \sqrt{-3})(a - \sqrt{-3}) = (a + 1 + 2\omega)(a - 1 - 2\omega)$.
If $p$ were a prime in $D$, it would have to divide one of the factors but this
cannot happen since $p \ne 2$ and $2/p \notin \mathbb{Z}$. Thus $p = \pi\gamma$ with $\pi$ and $\gamma$ nonunits.
Taking norms we see that $p^2 = N\pi N\gamma$ and that $p = N\pi = \pi\bar{\pi}$.

The last case is handled as follows: $x^3 - 1 = (x - 1)(x - \omega)(x - \omega^2)$
implies that $x^2 + x + 1 = (x - \omega)(x - \omega^2)$. Setting $x = 1$ yields
$3 = (1 - \omega)(1 - \omega^2) = (1 + \omega)(1 - \omega)^2 = -\omega^2(1 - \omega)^2$. Taking norms we see
that $9 = N(1 - \omega)^2$ and so $3 = N(1 - \omega)$. Thus $1 - \omega$ is a prime. □
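
The classification can be illustrated by searching for elements of a given norm. The sketch below again stores $a + b\omega$ as a pair `(a, b)`; the helpers `mul`, `conj`, and `elements_of_norm` are hypothetical names, and the multiplication rule uses $\omega^2 = -1 - \omega$.

```python
def mul(x, y):
    """(a + b w)(c + d w) with w^2 = -1 - w."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c - b * d)

def conj(x):
    """Complex conjugate: the conjugate of w is w^2 = -1 - w."""
    a, b = x
    return (a - b, -b)

def elements_of_norm(n):
    """All (a, b) with a^2 - ab + b^2 = n; 4n = (2a-b)^2 + 3b^2 bounds the search."""
    bound = int((4 * n / 3) ** 0.5) + 1
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if a * a - a * b + b * b == n]

product_check = mul((3, 1), (2, -1))       # (3 + w)(2 - w), the example 7 = (3 + w)(2 - w)
inert = [q for q in (2, 5, 11, 17) if not elements_of_norm(q)]      # q = 2 mod 3
split = {p: elements_of_norm(p) for p in (7, 13, 19, 31)}           # p = 1 mod 3
```

For $p = 7$ the search returns twelve elements: the six associates of one prime $\pi$ of norm 7 and the six associates of $\bar{\pi}$.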

As a matter of notation $q$ will be a positive rational prime congruent to 2
modulo 3 and $\pi$ a complex prime whose norm, $N\pi = p$, is a rational prime
congruent to 1 modulo 3. Occasionally $\pi$ will refer to an arbitrary prime of $D$.
The context should make the usage clear.


§2 Residue Class Rings 

Just as in the ring $\mathbb{Z}$ and in the ring of all algebraic integers, the notion
of congruence is extremely useful in $D$. If $\alpha, \beta, \gamma \in D$ and $\gamma \ne 0$ is a nonunit,
we say that $\alpha \equiv \beta\ (\gamma)$ if $\gamma$ divides $\alpha - \beta$. Just as in $\mathbb{Z}$ the congruence classes
modulo $\gamma$ may be made into a ring $D/\gamma D$, called the residue class ring modulo $\gamma$.

Proposition 9.2.1. Let $\pi \in D$ be a prime. Then $D/\pi D$ is a finite field with $N\pi$
elements.

Proof. We first show that $D/\pi D$ is a field. Let $\alpha \in D$ be such that $\alpha \not\equiv 0\ (\pi)$. By
Corollary 1 to Proposition 1.3.2 there exist elements $\beta, \gamma \in D$ such that
$\beta\alpha + \gamma\pi = 1$. Thus $\beta\alpha \equiv 1\ (\pi)$, which shows that the residue class of $\alpha$
is a unit in $D/\pi D$.

To show that $D/\pi D$ has $N\pi$ elements we must consider separately the
cases in Proposition 9.1.4.

Suppose that $\pi = q$ is a rational prime congruent to 2 modulo 3. We
claim that $\{a + b\omega \mid 0 \le a < q \text{ and } 0 \le b < q\}$ is a complete set of coset
representatives. This will show that $D/qD$ has $q^2 = Nq$ elements. Let
$\mu = m + n\omega \in D$. Then $m = qs + a$ and $n = qt + b$, where $s, t, a, b \in \mathbb{Z}$ and
$0 \le a, b < q$. Clearly $\mu \equiv a + b\omega\ (q)$. Next, suppose that $a + b\omega \equiv a' + b'\omega\ (q)$,
where $0 \le a, b, a', b' < q$. Then $((a - a')/q) + ((b - b')/q)\omega \in D$,
implying that $(a - a')/q$ and $(b - b')/q$ are in $\mathbb{Z}$. This is possible only if
$a = a'$ and $b = b'$.

Now suppose that $p \equiv 1\ (3)$ is a rational prime and $\pi\bar{\pi} = N\pi = p$.
We claim that $\{0, 1, \ldots, p - 1\}$ is a complete set of coset representatives.
This will show that $D/\pi D$ has $p = N\pi$ elements. Let $\pi = a + b\omega$. Since
$p = a^2 - ab + b^2$ it follows that $p \nmid b$. Let $\mu = m + n\omega$. There is an integer
$c$ such that $cb \equiv n\ (p)$. Then $\mu - c\pi \equiv m - ca\ (p)$ and so $\mu \equiv m - ca\ (\pi)$.
Every element of $D$ is congruent to a rational integer modulo $\pi$. If $l \in \mathbb{Z}$,
$l = sp + r$, where $s, r \in \mathbb{Z}$ and $0 \le r < p$. Thus $l \equiv r\ (p)$ and a fortiori
$l \equiv r\ (\pi)$. We have shown that every element of $D$ is congruent to an element
of $\{0, 1, 2, \ldots, p - 1\}$ modulo $\pi$. If $r \equiv r'\ (\pi)$ with $r, r' \in \mathbb{Z}$ and $0 \le r, r' < p$,
then $r - r' = \pi\gamma$ and $(r - r')^2 = pN\gamma$, implying that $p \mid r - r'$. Thus $r = r'$
and we are done.

We leave the case of the prime $1 - \omega$ as an exercise. □
§3 Cubic Residue Character


Let $\pi$ be a prime. Then the multiplicative group of $D/\pi D$ has order $N\pi - 1$.
Hence we have an analog of Fermat’s Little Theorem.

Proposition 9.3.1. If $\pi \nmid \alpha$, then

$$\alpha^{N\pi - 1} \equiv 1\ (\pi).$$

If the norm of $\pi$ is different from 3, then the residue classes of $1$, $\omega$, and
$\omega^2$ are distinct in $D/\pi D$. To see this, suppose, for example, that $\omega \equiv 1\ (\pi)$.
Then $\pi \mid (1 - \omega)$, and since $1 - \omega$ is prime, $\pi$ and $1 - \omega$ are associate.
Thus $N\pi = N(1 - \omega) = 3$, a contradiction. The other cases are handled
in the same way.

Since $\{1, \omega, \omega^2\}$ is a cyclic group of order 3 it follows that 3 divides the
order of $(D/\pi D)^*$; i.e., $3 \mid N\pi - 1$. This can be seen in another way using
Proposition 9.1.3. If $\pi = q$, a rational prime, then $N\pi = q^2 \equiv 1\ (3)$. If $\pi$ is
such that $N\pi = p$, then $p \equiv 1\ (3)$.

Proposition 9.3.2. Suppose that $\pi$ is a prime such that $N\pi \ne 3$ and that $\pi \nmid \alpha$.
Then there is a unique integer $m = 0, 1$, or $2$ such that $\alpha^{(N\pi - 1)/3} \equiv \omega^m\ (\pi)$.

Proof. We know that $\pi$ divides $\alpha^{N\pi - 1} - 1$. Now,

$$\alpha^{N\pi - 1} - 1 = (\alpha^{(N\pi - 1)/3} - 1)(\alpha^{(N\pi - 1)/3} - \omega)(\alpha^{(N\pi - 1)/3} - \omega^2).$$

Since $\pi$ is prime it must divide one of the three factors on the right. By
the preceding remarks it can divide at most one factor, since if it divided two
factors it would divide the difference. This proves the proposition. □

On the basis of this result we can make the following definition. 

Definition. If $N\pi \ne 3$, the cubic residue character of $\alpha$ modulo $\pi$ is given by

(a) $(\alpha/\pi)_3 = 0$ if $\pi \mid \alpha$.

(b) $\alpha^{(N\pi - 1)/3} \equiv (\alpha/\pi)_3\ (\pi)$, with $(\alpha/\pi)_3$ equal to $1$, $\omega$, or $\omega^2$.

This character plays the same role in the theory of cubic residues as the 
Legendre symbol plays in the theory of quadratic residues. 

Proposition 9.3.3.

(a) $(\alpha/\pi)_3 = 1$ iff $x^3 \equiv \alpha\ (\pi)$ is solvable, i.e., iff $\alpha$ is a cubic residue.

(b) $\alpha^{(N\pi - 1)/3} \equiv (\alpha/\pi)_3\ (\pi)$.

(c) $(\alpha\beta/\pi)_3 = (\alpha/\pi)_3(\beta/\pi)_3$.

(d) If $\alpha \equiv \beta\ (\pi)$, then $(\alpha/\pi)_3 = (\beta/\pi)_3$.

Proof. Part (a) is a special case of Proposition 7.1.2. Take $F = D/\pi D$, $q = N\pi$,
and $n = 3$ in that proposition.

Part (b) is immediate from the definition.

Part (c): $(\alpha\beta/\pi)_3 \equiv (\alpha\beta)^{(N\pi - 1)/3} \equiv \alpha^{(N\pi - 1)/3}\beta^{(N\pi - 1)/3} \equiv (\alpha/\pi)_3(\beta/\pi)_3\ (\pi)$.
The result follows.

Part (d): If $\alpha \equiv \beta\ (\pi)$, then $(\alpha/\pi)_3 \equiv \alpha^{(N\pi - 1)/3} \equiv \beta^{(N\pi - 1)/3} \equiv (\beta/\pi)_3\ (\pi)$,
and so $(\alpha/\pi)_3 = (\beta/\pi)_3$. □

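Since we shall be dealing only with cubic characters in this section the
notation $\chi_\pi(\alpha) = (\alpha/\pi)_3$ will be convenient.

It is useful to study the behavior of characters under complex conjugation.

Proposition 9.3.4.

(a) $\overline{\chi_\pi(\alpha)} = \chi_\pi(\alpha)^2 = \chi_\pi(\alpha^2)$.

(b) $\overline{\chi_\pi(\alpha)} = \chi_{\bar{\pi}}(\bar{\alpha})$.

Proof.

(a) $\chi_\pi(\alpha)$ is by definition $1$, $\omega$, or $\omega^2$, and each of these numbers squared is
equal to its conjugate.

(b) From

$$\alpha^{(N\pi - 1)/3} \equiv \chi_\pi(\alpha)\ (\pi)$$

we get

$$\bar{\alpha}^{(N\pi - 1)/3} \equiv \overline{\chi_\pi(\alpha)}\ (\bar{\pi}).$$

Since $N\bar{\pi} = N\pi$ this shows that $\chi_{\bar{\pi}}(\bar{\alpha}) \equiv \overline{\chi_\pi(\alpha)}\ (\bar{\pi})$ and thus that
$\chi_{\bar{\pi}}(\bar{\alpha}) = \overline{\chi_\pi(\alpha)}$. □

Corollary. $\chi_q(\bar{\alpha}) = \chi_q(\alpha^2)$ and $\chi_q(n) = 1$ if $n$ is a rational integer prime to $q$.

Proof. Since $\bar{q} = q$ we have $\chi_q(\bar{\alpha}) = \chi_{\bar{q}}(\bar{\alpha}) = \overline{\chi_q(\alpha)} = \chi_q(\alpha^2)$. This gives the
first relation.

Since $\bar{n} = n$ we have $\chi_q(n) = \chi_q(\bar{n}) = \chi_q(n)^2$. Since $\chi_q(n) \ne 0$ it follows
that $\chi_q(n) = 1$. □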

The corollary states that $n$ is a cubic residue modulo $q$. Thus, if $q_1 \ne q_2$
are two primes congruent to 2 modulo 3, then we have (trivially)
$\chi_{q_1}(q_2) = \chi_{q_2}(q_1)$. This is a special case of the law of cubic reciprocity. To formulate
the general law we need to introduce the idea of a “primary” prime.

Definition. If $\pi$ is a prime in $D$, we say that $\pi$ is primary if $\pi \equiv 2\ (3)$.

If $\pi = q$ is rational, this is nothing new. If $\pi = a + b\omega$ is a complex
prime, the definition is equivalent to $a \equiv 2\ (3)$ and $b \equiv 0\ (3)$.

We need a notion such as “primary” to eliminate the ambiguity caused
by the fact that every nonzero element of $D$ has six associates.


Proposition 9.3.5. Suppose that $N\pi = p \equiv 1\ (3)$. Among the associates of $\pi$
exactly one is primary.

Proof. Write $\pi = a + b\omega$. The associates of $\pi$ are $\pi, \omega\pi, \omega^2\pi, -\pi, -\omega\pi$, and
$-\omega^2\pi$. In terms of $a$ and $b$ these elements can be expressed as

(a) $a + b\omega$.

(b) $-b + (a - b)\omega$.

(c) $(b - a) - a\omega$.

(d) $-a - b\omega$.

(e) $b + (b - a)\omega$.

(f) $(a - b) + a\omega$.

Since $p = a^2 - ab + b^2$, not both $a$ and $b$ are divisible by 3. By looking
at parts (a) and (b) it is clear that we can assume that $3 \nmid a$. Considering
parts (a) and (d) we can assume further that $a \equiv 2\ (3)$. Under this assumption
$p = a^2 - ab + b^2$ leads to $1 \equiv 4 - 2b + b^2\ (3)$ or $b(b - 2) \equiv 0\ (3)$. If
$3 \mid b$, then $a + b\omega$ is primary. If $b \equiv 2\ (3)$, then $b + (b - a)\omega$ is primary.

To show uniqueness, assume that $a + b\omega$ is primary. By considering the
congruence class of the first term in part (b) to part (e) we see that none of
these expressions is primary. Neither is the expression in part (f) since the
coefficient of $\omega$, $a$, is not divisible by 3. □

For example, $3 + \omega$ is prime since $N(3 + \omega) = 7$, and $-\omega^2(3 + \omega) = 2 + 3\omega$
is the primary prime associated to it.

We can now state

Theorem 1 (The Law of Cubic Reciprocity). Let $\pi_1$ and $\pi_2$ be primary,
$N\pi_1, N\pi_2 \ne 3$, and $N\pi_1 \ne N\pi_2$. Then

$$\chi_{\pi_1}(\pi_2) = \chi_{\pi_2}(\pi_1).$$

A proof will be given in Section 4, but first a few remarks are in order.

(a) There are three cases to consider. Namely, both $\pi_1$ and $\pi_2$ are rational,
$\pi_1$ is rational and $\pi_2$ is complex, and both $\pi_1$ and $\pi_2$ are complex. The
first case is, as we have seen, trivial.

(b) The cubic character of the units can be dealt with as follows. Since
$-1 = (-1)^3$ we have $\chi_\pi(-1) = 1$ for all primes $\pi$.

If $N\pi \ne 3$, then it follows from Proposition 9.3.3, part (b), that
$\chi_\pi(\omega) = \omega^{(N\pi - 1)/3}$. Thus $\chi_\pi(\omega) = 1$, $\omega$, or $\omega^2$ according to whether
$N\pi \equiv 1$, 4, or 7 modulo 9.

(c) The prime $1 - \omega$ causes particular difficulty. If $N\pi \ne 3$, we would like
to evaluate $\chi_\pi(1 - \omega)$. This is done by Eisenstein in [29] by a highly
ingenious argument. An elegant proof due to K. Williams is given in the
Exercises.

Theorem 1′ (Supplement to the Cubic Reciprocity Law). Suppose that $N\pi \ne 3$.
If $\pi = q$ is rational, write $q = 3m - 1$. If $\pi = a + b\omega$ is a primary complex
prime, write $a = 3m - 1$. Then

$$\chi_\pi(1 - \omega) = \omega^{2m}.$$

We give a proof for the case of a rational prime $q$. Since $(1 - \omega)^2 = -3\omega$
we have

$$\chi_q(1 - \omega)^2 = \chi_q(-3)\chi_q(\omega).$$

By the corollary to Proposition 9.3.4 we know that $\chi_q(-3) = 1$. By
remark (b), $\chi_q(\omega) = \omega^{(Nq - 1)/3} = \omega^{(q^2 - 1)/3}$. Thus $\chi_q(1 - \omega)^2 = \omega^{(q^2 - 1)/3}$.
Squaring both sides yields

$$\chi_q(1 - \omega) = \omega^{(2/3)(q^2 - 1)}.$$

Now, $q^2 - 1 = 9m^2 - 6m$ so that $\tfrac{2}{3}(q^2 - 1) \equiv -4m \equiv 2m\ (3)$. The
result follows. For extensions of these results to primary elements see
Exercises 17 to 20 on page 135.

§4 Proof of the Law of Cubic Reciprocity 

Let $\pi$ be a complex prime such that $N\pi = p \equiv 1 \ (3)$. Since $D/\pi D$ is a finite
field of characteristic $p$ it contains a copy of $\mathbb{Z}/p\mathbb{Z}$. Both $D/\pi D$ and $\mathbb{Z}/p\mathbb{Z}$ have
$p$ elements. Thus we may identify the two fields. More explicitly the
identification is given by sending the coset of $n$ in $\mathbb{Z}/p\mathbb{Z}$ to the coset of $n$ in $D/\pi D$.

This identification allows us to consider $\chi_\pi$ as a cubic character on $\mathbb{Z}/p\mathbb{Z}$
in the sense of Chapter 8 [see Proposition 9.3.3, parts (c) and (d)]. Thus we
may work with the Gauss sums $g_a(\chi_\pi)$ and the Jacobi sum $J(\chi_\pi, \chi_\pi)$.

If $\chi$ is any cubic character, we have proved (see the corollary to Proposition
8.3.3 and Proposition 8.3.4) that

(a) $g(\chi)^3 = pJ(\chi, \chi)$.

(b) If $J(\chi, \chi) = a + b\omega$, then $a \equiv -1 \ (3)$ and $b \equiv 0 \ (3)$.

Since $J(\chi, \chi)\overline{J(\chi, \chi)} = p$, the second assertion says that $J(\chi, \chi)$ is a primary
prime in $D$ of norm $p$.

We need a lemma. Assume $\pi$ is primary.

Lemma 1. $J(\chi_\pi, \chi_\pi) = \pi$.

Proof. Let $J(\chi_\pi, \chi_\pi) = \pi'$. Since $\pi\bar\pi = p = \pi'\bar\pi'$ we have $\pi \mid \pi'$ or $\pi \mid \bar\pi'$.
Since all the primes involved are primary we must have $\pi = \pi'$ or $\pi = \bar\pi'$.
We wish to eliminate the latter possibility.

From the definitions,

$$J(\chi_\pi, \chi_\pi) = \sum_x \chi_\pi(x)\chi_\pi(1 - x) \equiv \sum_x x^{(p-1)/3}(1 - x)^{(p-1)/3} \ (\pi),$$

where the sum is over $\mathbb{Z}/p\mathbb{Z}$. The polynomial $x^{(p-1)/3}(1 - x)^{(p-1)/3}$ is of
degree $\tfrac{2}{3}(p - 1) < p - 1$. By Exercise 11 of Chapter 4 it follows that
$\sum_x x^{(p-1)/3}(1 - x)^{(p-1)/3} \equiv 0 \ (p)$. This shows that $J(\chi_\pi, \chi_\pi) \equiv 0 \ (\pi)$; i.e.,
$\pi \mid \pi'$ and therefore $\pi = \pi'$. □
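
Lemma 1 can be confirmed by direct computation for small primes. The sketch below is an illustration only: the name `cubic_jacobi_sum` and the pair representation `(a, b)` for $a + b\omega$ are our own assumptions. It evaluates $J(\chi_\pi, \chi_\pi)$ exactly in $\mathbb{Z}[\omega]$ by counting how often each power of $\omega$ occurs and then using $\omega^2 = -1 - \omega$.

```python
def cubic_jacobi_sum(pi):
    """Return (A, B) with J(chi_pi, chi_pi) = A + B*omega.

    pi = (a, b) stands for a + b*omega, assumed to be a prime of
    Z[omega] whose norm p is a rational prime = 1 (mod 3).
    (Hypothetical helper.)
    """
    a, b = pi
    p = a * a - a * b + b * b
    r = (-a * pow(b, -1, p)) % p                 # image of omega in Z/pZ
    exponent_of = {pow(r, j, p): j for j in range(3)}

    def chi(x):
        # exponent j with chi_pi(x) = omega**j, for x not divisible by p
        return exponent_of[pow(x, (p - 1) // 3, p)]

    counts = [0, 0, 0]
    for x in range(2, p):                        # the terms x = 0 and x = 1 vanish
        counts[(chi(x) + chi((1 - x) % p)) % 3] += 1

    # c0 + c1*omega + c2*omega**2 with omega**2 = -1 - omega
    return (counts[0] - counts[2], counts[1] - counts[2])


print(cubic_jacobi_sum((2, 3)))     # pi = 2 + 3*omega, norm 7
print(cubic_jacobi_sum((-1, 3)))    # pi = -1 + 3*omega, norm 13
```

In both cases the function returns the pair it was given, which is the assertion $J(\chi_\pi, \chi_\pi) = \pi$ for these primary primes.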

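Corollary. $g(\chi_\pi)^3 = p\pi$.

We can now prove the law of cubic reciprocity. We first consider the case
where $\pi_1 = q \equiv 2 \ (3)$ and $\pi_2 = \pi$ with $N\pi = p$.

Raise both sides of the relation $g(\chi_\pi)^3 = p\pi$ to the $(q^2 - 1)/3$ power. This
gives $g(\chi_\pi)^{q^2 - 1} = (p\pi)^{(q^2 - 1)/3}$. Taking congruences modulo $q$ we see that

$$g(\chi_\pi)^{q^2 - 1} \equiv \chi_q(p\pi) \ (q).$$

Since $\chi_q(p) = 1$ this leads to

$$g(\chi_\pi)^{q^2} \equiv \chi_q(\pi)g(\chi_\pi) \ (q). \qquad (1)$$

We now analyze the left-hand side:

$$g(\chi_\pi)^{q^2} = \Big(\sum_t \chi_\pi(t)\zeta^t\Big)^{q^2} \equiv \sum_t \chi_\pi(t)^{q^2}\zeta^{q^2 t} \ (q).$$

Since $q^2 \equiv 1 \ (3)$ and $\chi_\pi(t)$ is a cube root of 1 we have

$$g(\chi_\pi)^{q^2} \equiv g_{q^2}(\chi_\pi) \ (q). \qquad (2)$$

By Proposition 8.2.1 $g_{q^2}(\chi_\pi) = \chi_\pi(q^{-2})g(\chi_\pi) = \chi_\pi(q)g(\chi_\pi)$. Thus, combining
Equations (1) and (2),

$$\chi_\pi(q)g(\chi_\pi) \equiv \chi_q(\pi)g(\chi_\pi) \ (q).$$

Multiply both sides of this congruence by $g(\bar\chi_\pi)$. Since $g(\chi_\pi)g(\bar\chi_\pi) = p$,

$$\chi_\pi(q)p \equiv \chi_q(\pi)p \ (q)$$

or

$$\chi_\pi(q) \equiv \chi_q(\pi) \ (q),$$

implying that $\chi_\pi(q) = \chi_q(\pi)$.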

It remains to consider the case of two complex primes $\pi_1$ and $\pi_2$, where
$N\pi_1 = p_1 \equiv 1 \ (3)$ and $N\pi_2 = p_2 \equiv 1 \ (3)$. This case is handled by essentially
the same technique, but it is a little trickier.

Let $\gamma_1 = \bar\pi_1$ and $\gamma_2 = \bar\pi_2$. Then $\gamma_1$ and $\gamma_2$ are primary and $p_1 = \pi_1\gamma_1$ and
$p_2 = \pi_2\gamma_2$.

Starting from the relation $g(\chi_{\gamma_1})^3 = p_1\gamma_1$, raising to the $(N\pi_2 - 1)/3 =
(p_2 - 1)/3$ power, and taking congruences modulo $\pi_2$, we obtain by the same
method as above the relation

$$\chi_{\gamma_1}(p_2^2) = \chi_{\pi_2}(p_1\gamma_1). \qquad (3)$$

Similarly, starting from $g(\chi_{\pi_2})^3 = p_2\pi_2$, raising to the $(p_1 - 1)/3$ power,
and taking congruences modulo $\pi_1$, we obtain

$$\chi_{\pi_2}(p_1^2) = \chi_{\pi_1}(p_2\pi_2). \qquad (4)$$

We also need the relation $\chi_{\gamma_1}(p_2^2) = \chi_{\pi_1}(p_2)$, which follows from Proposition
9.3.4 since $\gamma_1 = \bar\pi_1$ and $\bar p_2 = p_2$. Now we calculate

$$\begin{aligned}
\chi_{\pi_1}(\pi_2)\chi_{\pi_2}(p_1\gamma_1) &= \chi_{\pi_1}(\pi_2)\chi_{\gamma_1}(p_2^2) && \text{by Equation (3)} \\
&= \chi_{\pi_1}(\pi_2)\chi_{\pi_1}(p_2) = \chi_{\pi_1}(p_2\pi_2) && \text{by above remark} \\
&= \chi_{\pi_2}(p_1^2) = \chi_{\pi_2}(p_1\pi_1\gamma_1) && \text{by Equation (4)} \\
&= \chi_{\pi_2}(\pi_1)\chi_{\pi_2}(p_1\gamma_1).
\end{aligned}$$

Equating the first and last terms and canceling gives the sought-for
result:

$$\chi_{\pi_1}(\pi_2) = \chi_{\pi_2}(\pi_1).$$


§5 Another Proof of the Law of Cubic Reciprocity 

We present a proof of cubic reciprocity using Jacobi sums. This proof is
somewhat shorter and more elegant than the one given in Section 4. It
should be noticed, however, that more background material is used.

Consider the case $\pi_1 = q$, $\pi_2 = \pi$. Let $\chi_\pi = \chi$ and consider the Jacobi
sum $J(\chi, \chi, \ldots, \chi)$ with $q$ terms. Since $3 \mid q + 1$ we have by Corollary 1 to
Theorem 3 of Chapter 8,

$$g(\chi)^{q+1} = pJ(\chi, \chi, \ldots, \chi). \qquad (5)$$

Since $g(\chi)^3 = p\pi$,

$$g(\chi)^{q+1} = (p\pi)^{(q+1)/3}. \qquad (6)$$

Now, recall that

$$J(\chi, \chi, \ldots, \chi) = \sum \chi(x_1)\chi(x_2)\cdots\chi(x_q),$$

where the sum is over all $x_1, x_2, \ldots, x_q \in \mathbb{Z}/p\mathbb{Z}$ such that $x_1 + x_2 + \cdots +
x_q = 1$. Consider the term for which $x_1 = x_2 = \cdots = x_q$. Then $qx_1 = 1$
and $\chi(q)\chi(x_1) = 1$. Raising both sides to the $q$th power, and recalling that
$q \equiv 2 \ (3)$, yields $\chi(q)^2\chi(x_1)^q = 1$ and so $\chi(x_1)^q = \chi(q)$. Thus the "diagonal"
term of $J(\chi, \chi, \ldots, \chi)$ has the value $\chi(q)$. If not all the $x_i$ are equal, there are $q$
different $q$-tuples obtained from $(x_1, x_2, \ldots, x_q)$ by cyclic permutation. The
corresponding terms of $J(\chi, \ldots, \chi)$ all have the same value. Thus

$$J(\chi, \chi, \ldots, \chi) \equiv \chi(q) \ (q). \qquad (7)$$

Combining Equations (5), (6), and (7) we obtain

$$(p\pi)^{(q+1)/3} \equiv p\chi(q) \ (q)$$

or

$$p^{(q-2)/3}\pi^{(q+1)/3} \equiv \chi(q) \ (q).$$

Raising both sides to the $q - 1$ power (remember that $q - 1 \equiv 1 \ (3)$),

$$p^{((q-2)/3)(q-1)}\pi^{(q^2-1)/3} \equiv \chi(q)^{q-1} = \chi(q) \ (q).$$

Since $p^{((q-2)/3)(q-1)} \equiv 1 \ (q)$ by Fermat's theorem and $\pi^{(q^2-1)/3} \equiv \chi_q(\pi) \ (q)$
it follows that

$$\chi_q(\pi) \equiv \chi_\pi(q) \ (q)$$

and

$$\chi_q(\pi) = \chi_\pi(q).$$

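Now consider the case of two primary complex primes $\pi_1$ and $\pi_2$. Let
$\gamma_1 = \bar\pi_1$, $\gamma_2 = \bar\pi_2$, $p_1 = \pi_1\gamma_1$, and $p_2 = \pi_2\gamma_2$. Then $p_1, p_2 \equiv 1 \ (3)$. By Theorem
3 of Chapter 8 we have

$$g(\chi_{\gamma_1})^{p_2} = J(\chi_{\gamma_1}, \ldots, \chi_{\gamma_1})\,g(\chi_{\gamma_1}^{p_2}).$$

There are $p_2$ terms in the Jacobi sum. Since $p_2 \equiv 1 \ (3)$, $\chi_{\gamma_1}^{p_2} = \chi_{\gamma_1}$. Thus

$$[g(\chi_{\gamma_1})^3]^{(p_2-1)/3} = J(\chi_{\gamma_1}, \ldots, \chi_{\gamma_1}). \qquad (8)$$

By isolating the diagonal term of the Jacobi sum (as we have done a
number of times by now) we find that

$$J(\chi_{\gamma_1}, \ldots, \chi_{\gamma_1}) \equiv \chi_{\gamma_1}(p_2^{-1}) = \chi_{\gamma_1}(p_2^2) \ (p_2).$$

Using this and the fact that $g(\chi_{\gamma_1})^3 = p_1\gamma_1$, we obtain from Equation (8)
the congruence

$$\chi_{\pi_2}(p_1\gamma_1) \equiv \chi_{\gamma_1}(p_2^2) \ (\pi_2)$$

and therefore

$$\chi_{\pi_2}(p_1\gamma_1) = \chi_{\gamma_1}(p_2^2). \qquad (9)$$

Similarly one proves that

$$\chi_{\pi_1}(p_2\pi_2) = \chi_{\pi_2}(p_1^2). \qquad (10)$$

Equations (9) and (10) are the basic relations. From here on one proceeds
exactly as in Section 4 to the desired conclusion $\chi_{\pi_1}(\pi_2) = \chi_{\pi_2}(\pi_1)$.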

§6 The Cubic Character of 2 


The law of cubic reciprocity can be used to develop the theory of cubic
residues in the same manner as the law of quadratic reciprocity led to the
results of Chapter 5, Section 2. We shall forego a development of the general
theory in favor of a discussion of an illuminating special case. Namely,
we shall ask for all primes $\pi$ in $D$ for which 2 is a cubic residue.

To begin with, notice that $x^3 \equiv 2 \ (\pi)$ is solvable iff $x^3 \equiv 2 \ (\pi')$ is solvable
for any associate $\pi'$ of $\pi$. Thus we may assume that $\pi$ is primary. If $\pi = q$ is a
rational prime, then $\chi_q(2) = 1$ and so 2 is a cubic residue for all such primes.
We assume from now on that $\pi = a + b\omega$ is a primary complex prime. By
cubic reciprocity $\chi_\pi(2) = \chi_2(\pi)$. The norm of 2 is $2^2 = 4$. Thus

$$\pi = \pi^{(4-1)/3} \equiv \chi_2(\pi) \ (2).$$

It follows that $\chi_\pi(2) = 1$ iff $\pi \equiv 1 \ (2)$. We have proved

Proposition 9.6.1. $x^3 \equiv 2 \ (\pi)$ is solvable iff $\pi \equiv 1 \ (2)$, i.e., iff $a \equiv 1 \ (2)$ and
$b \equiv 0 \ (2)$.

It is possible to formulate this proposition in another way. Let $\pi = a + b\omega$
be a primary complex prime and $p = N\pi = a^2 - ab + b^2$. Then $4p =
(2a - b)^2 + 3b^2$. If we set $A = 2a - b$ and $B = b/3$, then $4p = A^2 + 27B^2$.
According to Proposition 8.3.2 the integers $A$ and $B$ are uniquely determined
up to sign.

Proposition 9.6.2. If $p \equiv 1 \ (3)$, then $x^3 \equiv 2 \ (p)$ is solvable iff there are integers
$C$ and $D$ such that $p = C^2 + 27D^2$.

Proof. If $x^3 \equiv 2 \ (p)$ is solvable, so is $x^3 \equiv 2 \ (\pi)$ and thus $\pi \equiv 1 \ (2)$ by
Proposition 9.6.1. We have

$$4p = A^2 + 27B^2, \quad \text{where } A = 2a - b, \ B = \frac{b}{3}.$$

Since $b$ is even, so are $B$ and $A$. Let $D = B/2$ and $C = A/2$. Then $p =
C^2 + 27D^2$.

Suppose, conversely, that $p = C^2 + 27D^2$. Then $4p = (2C)^2 + 27(2D)^2$.
By uniqueness $B = \pm 2D$; i.e., $B$ is even and thus so is $b$. It follows that $\pi =
a + b\omega \equiv 1 \ (2)$, and $x^3 \equiv 2 \ (\pi)$ is solvable. Since $D/\pi D$ has $p = N\pi$ elements
there is an integer $h$ such that $h^3 \equiv 2 \ (\pi)$. It is now easy to show that $h^3 \equiv 2 \ (p)$.
If $\pi \mid h^3 - 2$, then $\bar\pi \mid h^3 - 2$ and $\pi\bar\pi = p \mid (h^3 - 2)^2$. Consequently, $p \mid h^3 - 2$
and we are done. □

As an example take $p = 7$. Then $x^3 \equiv 2 \ (7)$ is not solvable since there are
clearly no integers $C$ and $D$ such that $7 = C^2 + 27D^2$.

On the other hand, $p = 31 = 2^2 + 27 \cdot 1^2$. Thus $x^3 \equiv 2 \ (31)$ is solvable.
Indeed, $4^3 \equiv 2 \ (31)$.
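
Proposition 9.6.2 lends itself to a brute-force check. The following is a small sketch with helper names of our own choosing (`two_is_cubic_residue`, `is_c2_plus_27d2`); it simply compares the two conditions for every prime $p \equiv 1 \ (3)$ below a bound and proves nothing.

```python
from math import isqrt


def is_prime(n):
    """Trial division; adequate for the small range used here."""
    return n >= 2 and all(n % k for k in range(2, isqrt(n) + 1))


def two_is_cubic_residue(p):
    """True iff x**3 = 2 (mod p) has a solution."""
    return any(pow(x, 3, p) == 2 % p for x in range(p))


def is_c2_plus_27d2(p):
    """True iff p = C**2 + 27*D**2 for some integers C and D."""
    for d in range(isqrt(p // 27) + 1):
        c_squared = p - 27 * d * d
        c = isqrt(c_squared)
        if c * c == c_squared:
            return True
    return False


primes = [p for p in range(7, 1000) if p % 3 == 1 and is_prime(p)]
agree = all(two_is_cubic_residue(p) == is_c2_plus_27d2(p) for p in primes)
print(agree)
print([p for p in primes if is_c2_plus_27d2(p)][:3])
```

The two examples of the text appear as expected: 7 fails both tests and 31 passes both.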


§7 Biquadratic Reciprocity: Preliminaries

In his second memoir (1832) on biquadratic residues, Gauss stated, without
proof, the law of biquadratic reciprocity. The proof, he asserted, belonged to
the mysteries of the higher arithmetic. The details were to be published in
a third memoir, which unfortunately never appeared. Subsequently Eisenstein
published several proofs (1844), using Jacobi and Gauss sums. The basic
idea is the same as in the cubic case, although the details are more extensive.
The use of Gauss sums to prove reciprocity laws is due to Gauss himself,
who utilized them essentially in his sixth proof of quadratic reciprocity.

Throughout the following three sections $D$ denotes the ring $\mathbb{Z}[i]$ of
Gaussian integers. If $\alpha \in D$ then $(\alpha) = \alpha D$ is the principal ideal generated by
$\alpha$. By a prime will always be meant a positive prime of $\mathbb{Z}$. Recall from Chapter 1
that $D$ is a Euclidean ring. Thus if $\pi$ is irreducible and $\pi \mid \alpha\beta$ then either $\pi \mid \alpha$
or $\pi \mid \beta$. If $N(\alpha) = \alpha\bar\alpha$ is the norm of $\alpha$ then by Exercise 32 of Chapter 1,
$N(\alpha) = 1$ iff $\alpha$ is a unit. From this, one sees that the units of $D$ are $\pm 1$, $\pm i$.

Lemma 1. If $\pi$ is irreducible then there is a prime $p \in \mathbb{Z}$ such that $\pi \mid p$.

Proof. $N(\pi) = \pi\bar\pi = n = p_1 \cdots p_s$, $p_i$ prime, $\in \mathbb{Z}$. Thus $\pi \mid p_i$ for some $i$. □

Thus the irreducibles are found by decomposing in $D$ all primes in $\mathbb{Z}$. The
following lemma is useful.

Lemma 2. If $\alpha \in D$, and $N(\alpha)$ is prime then $\alpha$ is irreducible.

Proof. If $\alpha = \mu\lambda$ then $N(\alpha) = N(\mu)N(\lambda)$. Since $N(\alpha)$ is prime it follows
that $N(\mu) = 1$ or $N(\lambda) = 1$. Thus either $\mu$ or $\lambda$ is a unit.

Lemma 3. $1 + i$ is irreducible and $2 = -i(1 + i)^2$ is the prime factorization
of 2 in $D$.

Proof. $N(1 + i) = 2$ and so the first assertion follows from Lemma 2.
The second assertion results from a direct calculation.

Lemma 4. If $q \equiv 3 \ (4)$ is a prime in $\mathbb{Z}$, then $q$ is irreducible considered as an
element of $D$.

Proof. If $q$ were not irreducible in $D$, then $q = \alpha\beta$ with $N(\alpha) > 1$ and $N(\beta) > 1$.
Taking norms we find $q^2 = N(\alpha)N(\beta)$. It follows that $q = N(\alpha)$. If $\alpha = a + bi$
with $a, b \in \mathbb{Z}$, then $q = a^2 + b^2$. This is a contradiction since a sum of two
squares in $\mathbb{Z}$ is congruent to 0, 1, or 2 modulo 4, and $q$ is congruent to 3 modulo 4.

Lemma 5. If $p$ is prime, $p \equiv 1 \ (4)$ then there is an irreducible $\pi$ such that
$p = \pi\bar\pi$. Furthermore $(\pi) \neq (\bar\pi)$.

Proof. The first statement is part (a) of Proposition 8.3.1. Another proof not
using Jacobi sums is the following. Since $p \equiv 1 \ (4)$ there is, by Proposition
5.1.2, an integer $a$ with $a^2 \equiv -1 \ (p)$. Thus $p \mid a^2 + 1 = (a + i)(a - i)$. If $p$
were irreducible then $p \mid a + i$ or $p \mid a - i$, which is absurd. Thus $p = \alpha\beta$, $N(\alpha) > 1$,
$N(\beta) > 1$. Taking norms enables one to conclude that $p = N(\alpha)$. Since $N(\alpha)$
is prime it follows by Lemma 2 that $\alpha$ is irreducible. The fact that $(\alpha) \neq (\bar\alpha)$
is left as an exercise. □

This completes the description of the irreducibles in $D$.

Definition. A nonunit $\alpha \in D$ is primary if $\alpha \equiv 1 \ ((1 + i)^3)$.

Lemma 6. A nonunit $\alpha = a + bi$ is primary iff either $a \equiv 1 \ (4)$, $b \equiv 0 \ (4)$ or
$a \equiv 3 \ (4)$, $b \equiv 2 \ (4)$.

Proof. Since $(1 + i)^3 = 2i(1 + i)$ it follows that $a + bi$ is primary iff

$$\frac{(a - 1) + bi}{2 + 2i} = \frac{a + b - 1}{4} + \frac{b - a + 1}{4}\,i \in D.$$

This is equivalent to the congruences $a + b \equiv 1 \ (4)$, $a - b \equiv 1 \ (4)$. The
result follows easily from this. □

We note that any nonunit $\alpha \equiv 1 \ (4)$ in $D$ is primary. Furthermore if $\alpha$
is primary then $(1 + i) \nmid \alpha$. If $q$ is a real prime, $q \equiv 3 \ (4)$ then $-q$ is a primary
irreducible. As for the irreducibles arising from primes $p \equiv 1 \ (4)$ one has the
following important result.

Lemma 7. Let $\alpha \in D$ be a nonunit, $(1 + i) \nmid \alpha$. Then there is a unique unit $u$
such that $u\alpha$ is primary.

Proof. There is a unit $\varepsilon$ such that $\varepsilon\alpha = a + bi$ where $a$ is odd and $b$ is even.
Multiplying if necessary by $-1$, Lemma 6 shows that $\alpha$ has a primary
associate. If $u_1$ and $u_2$ are units such that $u_1\alpha$ and $u_2\alpha$ are primary then since
$(1 + i) \nmid \alpha$ it follows that $u_1 \equiv u_2 \ ((1 + i)^3)$. An examination of cases shows
easily that this implies $u_1 = u_2$.
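
Lemmas 6 and 7 translate directly into a short procedure: test the congruence conditions of Lemma 6 on each of the four associates. The sketch below is illustrative only; the names `is_primary` and `primary_associate` and the pair representation `(a, b)` for $a + bi$ are our own, and the caller is assumed to pass a nonunit.

```python
def is_primary(a, b):
    """Congruence test of Lemma 6 for a + b*i (a nonunit is assumed)."""
    return (a % 4, b % 4) in {(1, 0), (3, 2)}


def primary_associate(a, b):
    """Return the unique associate of a + b*i that is primary (Lemma 7).

    Requires that 1 + i does not divide a + b*i, i.e. that a + b is odd.
    """
    if (a + b) % 2 == 0:
        raise ValueError("1 + i divides alpha; no primary associate exists")
    z = (a, b)
    for _ in range(4):
        if is_primary(*z):
            return z
        z = (-z[1], z[0])        # multiply by i: i*(x + y*i) = -y + x*i
    raise AssertionError("unreachable: some associate is primary")


print(primary_associate(1, 2))   # an irreducible of norm 5
print(primary_associate(3, 0))   # the rational prime 3
```

For $1 + 2i$ the primary associate is $-1 - 2i$, and for the rational prime 3 it is $-3$, in line with the remark that $-q$ is primary when $q \equiv 3 \ (4)$.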

Lemma 8. A primary element can be written as the product of primary
irreducibles.

Proof. Let $\alpha \in D$ be primary. Then there are rational primes $q_i \equiv 3 \ (4)$,
primary irreducibles $\pi_i$, $N(\pi_i) \equiv 1 \ (4)$, and a unit $u$ such that $\alpha = u\pi_1 \cdots
\pi_t(-q_1) \cdots (-q_s)$. Reduction modulo $(1 + i)^3$ shows that $1 \equiv u \ ((1 + i)^3)$.
This implies that $u = 1$. □


§8 The Quartic Residue Symbol 

Consider an irreducible n in D. 

Proposition 9.8.1. The residue class ring $D/\pi D$ is a finite field with $N(\pi)$
elements.

Proof. The proof proceeds in exactly the same way as Proposition 9.2.1,
replacing the classification of irreducibles in $\mathbb{Z}[\omega]$ by the corresponding
classification in $D = \mathbb{Z}[i]$. □

Corollary. If $\pi \nmid \alpha$ then $\alpha^{N\pi - 1} \equiv 1 \ (\pi)$.

Proposition 9.8.2. If $\pi \nmid \alpha$, $(\pi) \neq (1 + i)$ there exists a unique integer $j$,
$0 \le j \le 3$ such that

$$\alpha^{(N(\pi)-1)/4} \equiv i^j \ (\pi).$$

Proof. It is easy to see that the residue classes of $1, -1, i, -i$ are distinct. They
are the roots of $x^4 \equiv 1 \ (\pi)$. However the residue class of $\alpha^{(N(\pi)-1)/4}$ is also a
solution to $x^4 \equiv 1 \ (\pi)$ by the above corollary. The result follows from this. □

Definition. If $\pi$ is an irreducible, $N(\pi) \neq 2$, then the biquadratic (or quartic)
residue character of $\alpha$, for $\pi \nmid \alpha$, is defined by $\chi_\pi(\alpha) = i^j$ where $j$ is determined
by Proposition 9.8.2. If $\pi \mid \alpha$ then $\chi_\pi(\alpha) = 0$.

Proposition 9.8.3.

(a) If $\pi \nmid \alpha$ then $\chi_\pi(\alpha) = 1 \iff x^4 \equiv \alpha \ (\pi)$ has a solution in $D$.

(b) $\chi_\pi(\alpha\beta) = \chi_\pi(\alpha)\chi_\pi(\beta)$.

(c) $\overline{\chi_\pi(\alpha)} = \chi_{\bar\pi}(\bar\alpha)$.

(d) If $\pi$ is a primary irreducible then $\chi_\pi(-1) = (-1)^{(a-1)/2}$, where $\pi =
a + bi$.

(e) If $\alpha \equiv \beta \ (\pi)$ then $\chi_\pi(\alpha) = \chi_\pi(\beta)$.

(f) $\chi_\pi(\alpha) = \chi_\lambda(\alpha)$ if $(\pi) = (\lambda)$.

Proof. Part (a) follows from Proposition 7.1.2. Parts (b), (c), (e), and (f)
follow immediately from the definition. Part (d) follows from Lemma 6
(see Exercise 38). □

Proposition 9.8.4. Let $q$ be prime, $q \equiv 3 \ (4)$. Then $\chi_q(a) = 1$ for $a \in \mathbb{Z}$, $q \nmid a$.

Proof. $N(q) = q^2$. Thus

$$\chi_q(a) \equiv a^{(q^2-1)/4} = (a^{q-1})^{(q+1)/4} \equiv 1 \ (q),$$

by Fermat's Little Theorem. □

The quartic residue character is generalized as follows. 

Definition. Let $\alpha \in D$ be a nonunit such that $(1 + i) \nmid \alpha$, and $\beta \in D$. Write
$\alpha = \prod_i \lambda_i$ where $\lambda_i$ is irreducible. If $(\alpha, \beta) = 1$ define $\chi_\alpha(\beta)$ by

$$\chi_\alpha(\beta) = \prod_i \chi_{\lambda_i}(\beta).$$

This is well defined by Proposition 9.8.3(f). By part (e) of that proposition
one sees that if $\beta \equiv \gamma \ (\alpha)$ then $\chi_\alpha(\beta) = \chi_\alpha(\gamma)$.

Proposition 9.8.5. Let $a \in \mathbb{Z}$ be an odd nonunit and $\alpha \in \mathbb{Z}$, $\alpha \neq 0$. If $(a, \alpha) = 1$,
then

$$\chi_a(\alpha) = 1.$$

Proof. We may assume $a > 0$. Write $a = \prod p_i \prod q_j$ where $p_i$, $q_j$ are prime,
$p_i \equiv 1 \ (4)$ and $q_j \equiv 3 \ (4)$. By Proposition 9.8.4 we need only verify that
$\chi_{p_i}(\alpha) = 1$. If $p_i = \pi\bar\pi$ where $\pi$ is irreducible then $\chi_{p_i}(\alpha) = \chi_\pi(\alpha)\chi_{\bar\pi}(\alpha) =
\chi_\pi(\alpha)\overline{\chi_\pi(\alpha)} = 1$ by Proposition 9.8.3(c).

Proposition 9.8.6. If $n \neq 1$ is an integer, $n \equiv 1 \ (4)$, then $\chi_n(i) = (-1)^{(n-1)/4}$.

Proof. Note that $n$ may be negative. If $n$ is a positive prime $p \equiv 1 \ (4)$ then
writing $p = \pi\bar\pi$ one has

$$\chi_p(i) = \chi_\pi(i)\chi_{\bar\pi}(i) = (i^{(p-1)/4})^2 = (-1)^{(p-1)/4}.$$

If on the other hand $n = -q$, $q \equiv 3 \ (4)$ and prime, then $\chi_{-q}(i) = i^{(q^2-1)/4} =
(i^{q-1})^{(q+1)/4} = (-1)^{(-q-1)/4}$. If $n \equiv 1 \ (4)$ is arbitrary then one may write
$n = p_1 \cdots p_t(-q_1) \cdots (-q_s)$, $p_i \equiv 1 \ (4)$, $q_j \equiv 3 \ (4)$. The result then follows
from Exercise 44. □


§9 The Law of Biquadratic Reciprocity 

The general law of biquadratic reciprocity may be stated as follows. Let $\lambda$
and $\pi$ be relatively prime primary elements of $D$. Then

Theorem 2. $\chi_\pi(\lambda) = \chi_\lambda(\pi)(-1)^{((N(\lambda)-1)/4)((N(\pi)-1)/4)}$.

If $\lambda$ and $\pi$ are primary, where $\lambda = c + di$ and $\pi = a + bi$, it is simple to
see that $((N(\lambda) - 1)/4)((N(\pi) - 1)/4)$ and $((a - 1)/2)((c - 1)/2)$ have the
same parity, so one may write

$$\chi_\pi(\lambda) = \chi_\lambda(\pi)(-1)^{((a-1)/2)((c-1)/2)}.$$

In other words if either $\pi$ or $\lambda$ is congruent to 1 modulo 4 then $\pi$ and $\lambda$ have
the same biquadratic character. If however both are congruent to $3 + 2i$
(see Lemma 6) then $\pi$ and $\lambda$ have "opposite" character in the sense that

$$\chi_\pi(\lambda) = -\chi_\lambda(\pi).$$
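
Both behaviours described above can be observed on small examples. The following is a minimal sketch under our own assumptions: $a + bi$ is stored as the pair `(a, b)`, both arguments are primary irreducibles whose norms are distinct primes congruent to 1 modulo 4, and $D/\pi D$ is identified with Z/pZ by sending $i$ to $r = -a b^{-1}$ modulo $p$. The function names are hypothetical.

```python
def quartic_exponent(pi, alpha):
    """Return j in {0, 1, 2, 3} with chi_pi(alpha) = i**j.

    pi = (a, b) stands for a + b*i with prime norm p = 1 (mod 4);
    alpha = (c, d) stands for c + d*i, not divisible by pi.
    """
    a, b = pi
    c, d = alpha
    p = a * a + b * b                   # N(pi)
    r = (-a * pow(b, -1, p)) % p        # image of i in Z/pZ, since a + b*r = 0 (p)
    value = pow((c + d * r) % p, (p - 1) // 4, p)
    for j in range(4):
        if value == pow(r, j, p):       # compare with the image of i**j
            return j
    raise ValueError("pi divides alpha, or N(pi) is not a prime = 1 (mod 4)")


def reciprocity_holds(pi, lam):
    """Check chi_pi(lam) = chi_lam(pi) * (-1)**(((a-1)/2)*((c-1)/2))."""
    a, c = pi[0], lam[0]
    flip = (((a - 1) // 2) * ((c - 1) // 2)) % 2
    difference = quartic_exponent(pi, lam) - quartic_exponent(lam, pi)
    return difference % 4 == 2 * flip


# Both congruent to 3 + 2i modulo 4: the characters differ by a sign.
print(quartic_exponent((-1, 2), (3, 2)), quartic_exponent((3, 2), (-1, 2)))
# 1 + 4i is congruent to 1 modulo 4: the characters agree.
print(quartic_exponent((1, 4), (3, 2)), quartic_exponent((3, 2), (1, 4)))
```

For $\pi = -1 + 2i$ and $\lambda = 3 + 2i$ the exponents are 2 and 0, so $\chi_\pi(\lambda) = -1 = -\chi_\lambda(\pi)$; for $\pi = 1 + 4i$ and $\lambda = 3 + 2i$ both exponents equal 1.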

Consider a primary irreducible $\pi$ with $N(\pi) = p \equiv 1 \ (4)$ and let $\chi_\pi$ be the
associated quartic residue character. Then $\chi_\pi$ may be viewed as a
multiplicative character on the finite field $D/\pi D = F$. Recall that $F$ is a finite field with $p$
elements consisting of the residue classes of $0, 1, \ldots, p - 1$. If $\zeta = e^{2\pi i/p}$ let
$g(\chi_\pi) = \sum_j \chi_\pi(j)\zeta^j$ be the Gauss sum belonging to $\chi_\pi$. If $\psi = \chi_\pi^2$ then $\psi$
is the nontrivial character of order 2 on $F$ and thus is the Legendre symbol.

Proposition 9.9.1. $J(\chi_\pi, \chi_\pi) = \chi_\pi(-1)J(\chi_\pi, \psi)$.

Proof. By Theorem 1, Chapter 8, one has $J(\chi_\pi, \chi_\pi) = g(\chi_\pi)^2/g(\psi)$. Thus

$$J(\chi_\pi, \chi_\pi) = \frac{g(\chi_\pi)^2 g(\psi)}{g(\psi)^2} = \chi_\pi(-1)J(\chi_\pi, \psi)$$

using Propositions 6.3.2 and 8.3.3. This gives the result. □

Proposition 9.9.2. $g(\chi_\pi)^4 = pJ(\chi_\pi, \chi_\pi)^2$.

Proof. This follows immediately from Propositions 9.9.1 and 8.3.3. □

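Proposition 9.9.3. $-\chi_\pi(-1)J(\chi_\pi, \chi_\pi)$ is primary.

Proof. Clearly

$$J(\chi_\pi, \chi_\pi) = 2\sum_{t=2}^{(p-1)/2} \chi_\pi(t)\chi_\pi(1 - t) + \chi_\pi\Big(\frac{p + 1}{2}\Big)^2.$$

But any unit in $D$ is congruent to 1 modulo $1 + i$. Also $p \equiv 1 \ (2 + 2i)$.
Finally $\chi_\pi((p + 1)/2)^2 = (\chi_\pi(2^{-1}))^2 = \chi_\pi(2)^{-2} = \chi_\pi(2)^2 = \chi_\pi(-i(1 + i)^2)^2 =
\chi_\pi(-i)^2 = \chi_\pi(-1)$. Thus

$$J(\chi_\pi, \chi_\pi) \equiv 2 \cdot \frac{p - 3}{2} + \chi_\pi(-1) \ (2 + 2i)$$
$$\equiv -2 + \chi_\pi(-1) \ (2 + 2i).$$

Thus

$$-\chi_\pi(-1)J(\chi_\pi, \chi_\pi) \equiv 2\chi_\pi(-1) - 1 \ (2 + 2i)$$
$$\equiv 1 \ (2 + 2i),$$

since $\chi_\pi(-1) = \pm 1$. □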

The next proposition identifies the primary element $-\chi_\pi(-1)J(\chi_\pi, \chi_\pi)$.

Proposition 9.9.4. $-\chi_\pi(-1)J(\chi_\pi, \chi_\pi) = \pi$.

Proof. By Lemma 7 of Section 7 it is enough to show that the left- and
right-hand sides differ by a unit. Now $J(\chi_\pi, \chi_\pi) \equiv \sum_t t^{(p-1)/4}(1 - t)^{(p-1)/4} \ (\pi)$.
By Exercise 11 of Chapter 4 it follows that $J(\chi_\pi, \chi_\pi) \equiv 0 \ (\pi)$. By the corollary
to Theorem 1 of Chapter 8, $N(J(\chi_\pi, \chi_\pi)) = p$. Thus $J(\chi_\pi, \chi_\pi)$ is irreducible
and the proposition is complete. □

Combining Proposition 9.9.4 with Proposition 9.9.2 gives the
factorization of $g(\chi_\pi)^4$ in $D$.

Proposition 9.9.5. $g(\chi_\pi)^4 = \pi^3\bar\pi$.

We will now prove two particular cases of the law of biquadratic
reciprocity. The general statement will then be a formal, if somewhat tedious,
consequence.


Proposition 9.9.6. Let $q > 0$ be a real irreducible in $D$. Then

$$\chi_\pi(-q) = \chi_q(\pi).$$

Proof. Since $q \equiv 3 \ (4)$ one has

$$g(\chi_\pi)^q \equiv \sum_{j=1}^{p-1} \chi_\pi(j)^q\zeta^{qj} = \sum_{j=1}^{p-1} \bar\chi_\pi(j)\zeta^{qj} \ (q)$$
$$= \chi_\pi(q)g(\bar\chi_\pi) \ (q).$$

Thus

$$(g(\chi_\pi)^4)^{(q+1)/4} = g(\chi_\pi)^{q+1} \equiv \chi_\pi(q)\,g(\chi_\pi)g(\bar\chi_\pi) \ (q).$$

By the observation following Proposition 8.2.2 and noting (see Exercise 45)
that $\bar\pi \equiv \pi^q \ (q)$ one has, by Proposition 9.9.5,

$$\pi^{[(q+3)(q+1)]/4} \equiv \chi_\pi(-1)\chi_\pi(q)\pi^{q+1} \ (q)$$

or

$$\pi^{(q^2-1)/4} \equiv \chi_\pi(-q) \ (q).$$

But $\pi^{(q^2-1)/4} \equiv \chi_q(\pi) \ (q)$. Thus

$$\chi_q(\pi) \equiv \chi_\pi(-q) \ (q),$$

which implies, since both sides are units, that

$$\chi_q(\pi) = \chi_\pi(-q).$$

This completes the proof. □

Notice that $-q$ is a primary irreducible and $(N(q) - 1)/4 = (q^2 - 1)/4$ is
even. Thus Proposition 9.9.6 is indeed a special case of biquadratic
reciprocity.

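Proposition 9.9.7. Let $q$ be prime, $q \equiv 1 \ (4)$. Then $\chi_\pi(q) = \chi_q(\pi)$.

Proof. Since $q \equiv 1 \ (4)$,

$$g(\chi_\pi)^q \equiv \sum_j \chi_\pi(j)^q\zeta^{qj} = \sum_j \chi_\pi(j)\zeta^{qj} = \bar\chi_\pi(q)g(\chi_\pi) \ (q).$$

Thus

$$g(\chi_\pi)^{q+3} \equiv \bar\chi_\pi(q)g(\chi_\pi)^4 \ (q).$$

By Proposition 9.9.5 this becomes

$$(\pi^3\bar\pi)^{(q+3)/4} \equiv \bar\chi_\pi(q)\pi^3\bar\pi \ (q).$$

Both sides of this congruence belong to $D$ and $(q, \pi) = (q, \bar\pi) = 1$. Thus we
may divide to obtain

$$(\pi^3)^{(q-1)/4}(\bar\pi)^{(q-1)/4} \equiv \bar\chi_\pi(q) \ (q).$$

If $q = \lambda\bar\lambda$ where $\lambda$ is an irreducible in $D$ then this implies

$$\chi_\lambda(\pi^3)\chi_\lambda(\bar\pi) \equiv \bar\chi_\pi(q) \ (\lambda).$$

As in the previous case we conclude that

$$\chi_\lambda(\pi^3)\chi_\lambda(\bar\pi) = \bar\chi_\pi(q).$$

This may be written as

$$\bar\chi_\lambda(\pi)\chi_\lambda(\bar\pi) = \bar\chi_\pi(q)$$

or

$$\chi_{\bar\lambda}(\bar\pi)\chi_\lambda(\bar\pi) = \bar\chi_\pi(q)$$

which gives, by definition,

$$\chi_q(\bar\pi) = \bar\chi_\pi(q).$$

Taking conjugates completes the proof. □

The reader should notice that in Proposition 9.9.7, $q$ is not irreducible
and that $\chi_q(\pi)$ is the generalized biquadratic residue symbol.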

The following proposition is a formal exercise using Lemma 8 of Section 7,
and Propositions 9.8.6, 9.9.6, and 9.9.7.

Proposition 9.9.8. Let $a$ be real and $a \equiv 1 \ (4)$ and $\lambda$ be primary, $(\lambda, a) = 1$.
Then $\chi_a(\lambda) = \chi_\lambda(a)$.

Suppose now that $\pi = a + bi$ and $\lambda = c + di$ are primary and relatively
prime. We do not assume that $N(\pi) \neq N(\lambda)$, or that they are irreducible.


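Proposition 9.9.9. If $(a, b) = 1$, $(c, d) = 1$ then

$$\chi_\lambda(\pi) = \chi_\pi(\lambda)(-1)^{((a-1)/2)((c-1)/2)}.$$

Proof. The hypothesis implies that $(a, \pi) = (b, \pi) = (c, \lambda) = (d, \lambda) = 1$.
The relation $c\pi \equiv ac + bd \ (\lambda)$ implies $(ac + bd, \lambda) = (ac + bd, \pi) = 1$.
Furthermore

$$\chi_\lambda(c)\chi_\lambda(\pi) = \chi_\lambda(ac + bd). \qquad (1)$$

Similarly

$$\chi_\pi(a)\chi_\pi(\lambda) = \chi_\pi(ac + bd). \qquad (2)$$

Taking the conjugate of (2) and multiplying by (1) one obtains the relation

$$\chi_\lambda(c)\overline{\chi_\pi(a)}\,\chi_\lambda(\pi)\overline{\chi_\pi(\lambda)} = \chi_{\lambda\bar\pi}(ac + bd).$$

Thus we have shown, using Proposition 9.8.3(c),

$$\chi_\lambda(\pi)\overline{\chi_\pi(\lambda)} = \chi_{\bar\lambda}(c)\chi_\pi(a)\chi_{\lambda\bar\pi}(ac + bd). \qquad (3)$$

Assume that $c$, $a$, and $ac + bd$ are nonunits. The three terms on the
right-hand side are easily computed. For an odd integer $n$ put $\varepsilon(n) = (-1)^{(n-1)/2}$.
Then $\varepsilon(n)n \equiv 1 \ (4)$ and $\varepsilon(ac + bd) = \varepsilon(a)\varepsilon(c)$ since $bd \equiv 0 \ (4)$. Writing
$\chi_\alpha(x) = \chi_\alpha(\varepsilon(x))\chi_\alpha(\varepsilon(x)x)$ for each term on the right-hand side of (3) one
obtains, noting that $\chi_\alpha(\varepsilon(x)) = \chi_{\bar\alpha}(\varepsilon(x))$ and using Proposition 9.9.8 and
9.8.3(b),

$$\chi_\lambda(\pi)\overline{\chi_\pi(\lambda)} = \chi_c(\bar\lambda)\chi_a(\pi)\chi_{ac+bd}(\lambda\bar\pi). \qquad (4)$$

As for the last three terms one computes, using Proposition 9.8.5,

$$\chi_c(\bar\lambda) = \chi_c(c - di) = \chi_c(-di) = \chi_c(i),$$
$$\chi_a(\pi) = \chi_a(a + bi) = \chi_a(bi) = \chi_a(i),$$
$$\chi_{ac+bd}(\lambda\bar\pi) = \chi_{ac+bd}((ad - bc)i) = \chi_{ac+bd}(i).$$

Thus we have the relation

$$\chi_\lambda(\pi)\overline{\chi_\pi(\lambda)} = \chi_{(ac+bd)ac}(i)$$
$$= (-1)^{((a-1)/2)((c-1)/2)}. \quad \text{(Proposition 9.8.6)} \qquad (5)$$

The last equality is a simple exercise using Lemma 6 of Section 7. We leave
to the reader the simple task of carrying through the situation in which one
of $a$, $c$, or $ac + bd$ is a unit. □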



The general law of biquadratic reciprocity follows easily from Proposition
9.9.9. For write $\pi = m(a + bi)$, $\lambda = n(c + di)$, $(\pi, \lambda) = 1$ where $m \equiv n \equiv 1 \ (4)$,
$(a, b) = 1$, $(c, d) = 1$. By Proposition 9.9.8, $\chi_n(\pi) = \chi_\pi(n)$ and $\chi_\lambda(m) = \chi_m(\lambda)$.
Also $\chi_m(n) = \chi_n(m) = 1$ by Proposition 9.8.5. Then, since $a + bi$ and $c + di$
are primary,

$$\begin{aligned}
\chi_\lambda(\pi) &= \chi_\lambda(m)\chi_\lambda(a + bi) \\
&= \chi_m(\lambda)\chi_n(a + bi)\chi_{c+di}(a + bi) \\
&= \chi_m(\lambda)\chi_{a+bi}(n)\chi_{a+bi}(c + di)(-1)^{((a-1)/2)((c-1)/2)} \\
&= \chi_\pi(\lambda)(-1)^{((a-1)/2)((c-1)/2)} \\
&= \chi_\pi(\lambda)(-1)^{((N(\pi)-1)/4)((N(\lambda)-1)/4)},
\end{aligned}$$

where in the last line we have used the fact that $m \equiv n \equiv 1 \ (4)$. This completes
the proof, a monument to ingenuity and persistence! □


§10 Rational Biquadratic Reciprocity 


Throughout this section $p$ and $q$ denote distinct primes congruent to 1
modulo 4. Then the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^*$ has a unique subgroup
of order $(p - 1)/4$ consisting of the residues of fourth powers of integers.
Consider the biquadratic residue character $\chi_\pi$ defined by means of an
irreducible $\pi$ in $\mathbb{Z}[i]$ dividing $p$. By Proposition 9.8.3 $\chi_\pi(q) = 1$ iff $x^4 \equiv q \ (\pi)$
has a solution with $x \in \mathbb{Z}[i]$.

Lemma 1. $\chi_\pi(q) = 1$ iff $x^4 \equiv q \ (p)$ has a solution with $x \in \mathbb{Z}$.

Proof. By Proposition 9.8.1 the integers $0, 1, 2, \ldots, p - 1$ form a complete
set of residues for the residue classes of $\mathbb{Z}[i]$ modulo $\pi$. Thus $\chi_\pi(q) = 1$ iff
$x^4 \equiv q \ (\pi)$ has a solution with $x \in \mathbb{Z}$. It follows that $x^4 \equiv q \ (\bar\pi)$. However,
$(\pi, \bar\pi) = 1$. Thus $p = \pi\bar\pi \mid x^4 - q$. □

Let $\psi_p$ denote the quadratic residue character.

Lemma 2. If $\psi_p(q) = 1$ then $\chi_\pi(q) = \pm 1$.

Proof. Since $q^{(p-1)/2} \equiv 1 \ (p)$ it follows that $\chi_\pi^2(q) \equiv (q^{(p-1)/4})^2 = q^{(p-1)/2} \equiv
1 \ (\pi)$. Thus $\chi_\pi^2(q) = 1$. □

Thus, assuming that $q$ is a square modulo $p$, $\chi_\pi(q)$ is $+1$ or $-1$ according
as $q$ is or is not a fourth power modulo $p$. By the law of quadratic reciprocity
$\psi_q(p) = +1$. Notice that the value $\chi_\pi(q)$ depends only on $p$ and $q$ and not
on the choice of the irreducible $\pi$. Contrary to what one might expect the
relationship between the two integers $\chi_\pi(q)$ and $\chi_\lambda(p)$, where $\lambda$ is an irreducible
dividing $q$, is not a simple consequence of the law of biquadratic reciprocity.
In 1969 K. Burde [102] discovered the following remarkable reciprocity
law. Since $p$ and $q$ are congruent to 1 modulo 4 we may write $p = a^2 + b^2$,
$q = c^2 + d^2$, where $a \equiv c \equiv 1 \ (2)$ and $b \equiv d \equiv 0 \ (2)$. Throughout the
following we assume $\psi_q(p) = 1$.

Theorem 3. $\chi_\pi(q)\chi_\lambda(p) = (-1)^{(q-1)/4}\psi_q(ad - bc)$.
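
Since both sides of Theorem 3 are rational quantities, the law can be tested with ordinary modular arithmetic. The sketch below is our own illustration with hypothetical helper names; it uses Lemmas 1 and 2 to compute $\chi_\pi(q)$ as $q^{(p-1)/4}$ modulo $p$ (which is $\pm 1$ under the standing assumption) and finds the decompositions $p = a^2 + b^2$, $q = c^2 + d^2$ by search.

```python
from math import isqrt


def is_prime(n):
    return n >= 2 and all(n % k for k in range(2, isqrt(n) + 1))


def legendre(x, p):
    """Quadratic residue character psi_p(x), returned as 1, -1 or 0."""
    v = pow(x % p, (p - 1) // 2, p)
    return -1 if v == p - 1 else v


def rational_quartic(x, p):
    """+1 or -1 according as x is or is not a fourth power mod p.

    Assumes p = 1 (mod 4) and psi_p(x) = 1.
    """
    return 1 if pow(x % p, (p - 1) // 4, p) == 1 else -1


def odd_even_squares(p):
    """Return (a, b) with p = a*a + b*b, a odd and b even."""
    for b in range(0, isqrt(p) + 1, 2):
        a_squared = p - b * b
        a = isqrt(a_squared)
        if a * a == a_squared:
            return a, b
    raise ValueError("p is not a sum of two squares")


def burde_holds(p, q):
    a, b = odd_even_squares(p)
    c, d = odd_even_squares(q)
    lhs = rational_quartic(q, p) * rational_quartic(p, q)
    rhs = (-1) ** ((q - 1) // 4) * legendre(a * d - b * c, q)
    return lhs == rhs


primes = [p for p in range(5, 200) if p % 4 == 1 and is_prime(p)]
pairs = [(p, q) for p in primes for q in primes
         if p != q and legendre(p, q) == 1]
print(all(burde_holds(p, q) for p, q in pairs))
```

For instance, with $p = 5 = 1^2 + 2^2$ and $q = 29 = 5^2 + 2^2$ one finds $ad - bc = -8$, a nonresidue modulo 29, and both sides of the law equal $+1$.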

The following elegant proof is due to K. Williams [244]. The law of 
biquadratic reciprocity is not assumed. However the value of the quadratic 
Gauss sum is used (Chapter 6, Section 4). The following proposition is of 
interest in itself. (See the comment at the end of Section 12). 

Proposition 9.10.1. Let $\pi$ be the primary irreducible dividing $p$. Then

$$g(\chi_\pi)^2 = -(-1)^{(p-1)/4}\pi\sqrt{p},$$

where $\sqrt{p}$ denotes the positive square root.

Proof. By Proposition 9.9.4 and Theorem 1, Chapter 8 we have

$$-\chi_\pi(-1)\pi = J(\chi_\pi, \chi_\pi) = \frac{g(\chi_\pi)^2}{g(\chi_\pi^2)}.$$

The proposition follows from Theorem 1, Chapter 6 and the observation
that $\chi_\pi(-1) = (-1)^{(p-1)/4}$. □

Proposition 9.10.2. If $\pi$ is a primary irreducible dividing $p$ then $\chi_\pi(q)\chi_\lambda(p) \equiv
\pi^{(q-1)/2} \ (q)$.

Proof. We have, in the ring of all algebraic integers,

$$\begin{aligned}
g(\chi_\pi)^q &= \Big(\sum_j \chi_\pi(j)\zeta^j\Big)^q \\
&\equiv \sum_j \chi_\pi(j)\zeta^{qj} \ (q) \\
&\equiv \chi_\pi(q^{-1})g(\chi_\pi) \ (q) \\
&\equiv \chi_\pi(q)g(\chi_\pi) \ (q).
\end{aligned}$$

The last congruence follows because

$$\chi_\pi(q^{-1}) = \chi_\pi(q)^{-1} = \overline{\chi_\pi(q)} = \chi_\pi(q).$$

Thus, multiplying by $g(\chi_\pi)^3$,

$$g(\chi_\pi)^4(g(\chi_\pi))^{q-1} \equiv \chi_\pi(q)g(\chi_\pi)^4 \ (q).$$

The two terms on the left-hand side are in $\mathbb{Z}[i]$ by Proposition 8.3.3; and
by Proposition 8.2.2 $N(g(\chi_\pi)^4) = p^4$. Thus one may cancel $g(\chi_\pi)^4$ to obtain

$$g(\chi_\pi)^{q-1} \equiv \chi_\pi(q) \ (q).$$

Using Proposition 9.10.1 one obtains

$$(g(\chi_\pi)^2)^{(q-1)/2} = \pi^{(q-1)/2}p^{(q-1)/4} \equiv \chi_\pi(q) \ (q).$$

But $p^{(q-1)/4} \equiv \chi_\lambda(p) \ (\lambda)$ and since both sides of this congruence are real it
follows, taking conjugates and noting $(\lambda, \bar\lambda) = 1$, that this congruence holds
modulo $q$. This completes the proof. □

In the following proposition $\pi$ is not assumed to be primary. Write
$\pi = a + bi$ and $\lambda = c + di$.

Proposition 9.10.3. $\pi^{(q-1)/2} \equiv \psi_q(d)\psi_q(ad - bc) \ (q)$.

Proof. Since $d\pi \equiv ad - bc \ (\lambda)$ one has

$$(d\pi)^{(q-1)/2} \equiv (ad - bc)^{(q-1)/2} \ (\lambda).$$

Thus

$$\psi_q(d)\pi^{(q-1)/2} \equiv \psi_q(ad - bc) \ (\lambda).$$

Similarly $d\pi \equiv ad + bc \ (\bar\lambda)$ implies

$$\psi_q(d)\pi^{(q-1)/2} \equiv \psi_q(ad + bc) \ (\bar\lambda).$$

The proof now follows from the following lemma. □

Lemma 3. $\psi_q(ad - bc) = \psi_q(ad + bc)$.

Proof. Since $c^2 \equiv -d^2 \ (q)$ one has

$$\psi_q(ad - bc)\psi_q(ad + bc) = \psi_q(a^2d^2 - b^2c^2) = \psi_q(d^2p) = \psi_q(p) = 1. \ □$$

Note furthermore that since $\psi_q(-1) = 1$ one has as a consequence of the
above lemma $\psi_q(ad - bc) = \psi_q(-ad + bc) = \psi_q(-ad - bc)$. Thus in the
statement of Theorem 3 there is no loss of generality in assuming that $\pi$ is
primary. With this assumption one concludes from Propositions 9.10.2
and 9.10.3 that

$$\chi_\pi(q)\chi_\lambda(p) = \psi_q(d)\psi_q(ad - bc).$$

The proof of Theorem 3 is completed by the following lemma.

Lemma 4. If $q = c^2 + d^2$, $c > 0$, $c \equiv 1 \ (2)$ then $\psi_q(d) = (-1)^{(q-1)/4}$.

Proof. Let $\psi_c$ denote the Jacobi symbol. Then by Proposition 5.2.2 one has
$\psi_q(c) = \psi_c(q) = \psi_c(d^2) = 1$. (Cf. Exercise 26, Chapter 5). But $c^2 \equiv -d^2 \ (q)$
implies $c^{(q-1)/2} \equiv (-1)^{(q-1)/4}d^{(q-1)/2} \ (q)$. Thus $\psi_q(c) = 1 = (-1)^{(q-1)/4}\psi_q(d)$. □


§11 The Constructibility of Regular Polygons 

On March 30, 1796 C. Gauss, then almost 19 years old, began a diary in
which he recorded his mathematical discoveries. The first entry reads
"Principia quibus innititur sectio circuli, ac divisibilitas eiusdem geometrica
in septemdecim partes, etc.," a rough translation of which is "Principles upon
which the division of a circle into 17 parts depends, etc." More generally
in his Disquisitiones Arithmeticae, §365, Gauss proves, using "cyclotomic
periods" that if $p$ is a prime of the form $2^n + 1$ then a regular polygon with
$p$ sides is constructible by ruler and compass.

In this section we give a short proof of this result using Gauss and Jacobi
sums.

Generally speaking the constructible complex numbers in our context
are those numbers that may be obtained from $\mathbb{Q}$ by a finite sequence of
rational operations and the formation of square roots. More precisely

Definition. A complex number $\alpha \in \mathbb{C}$ is constructible if there exist
subfields of $\mathbb{C}$, $\mathbb{Q} = K_0 \subset K_1 \subset K_2 \subset \cdots \subset K_n$ such that $\alpha \in K_n$ and $K_{i+1} =
K_i(\sqrt{\beta_i})$ for some $\beta_i \in K_i$, $i = 0, \ldots, n - 1$.

Here $K(\sqrt{\beta})$ denotes the field of all complex numbers $a + b\sqrt{\beta}$, $a, b \in K$
(see Exercise 6, Chapter 6). It is easy to see that $\alpha$ is constructible iff the real
and imaginary parts of $\alpha$ are constructible. Furthermore if $\alpha$ is constructible
then $\sqrt{\alpha}$ is constructible. Let, as usual, $\zeta_t = e^{2\pi i/t}$.

Lemma 1. $\zeta_{2^n}$ is constructible, $n = 1, 2, \ldots$.

the sum being over all characters of $F^*$.

Proof. If $\chi = \varepsilon$, the trivial character, then $\varepsilon(0) = 1$. Thus the result holds for
$t = 0$. It is true when $t = 1$ by Proposition 8.1.3 while the remaining case is
the corollary to Proposition 8.1.3. □

Recall that a Fermat prime is a prime of the form $2^n + 1$.

Theorem 4. If $p$ is a Fermat prime then $\zeta_p$ is constructible.

Proof. If $g(\chi) = \sum_{t=0}^{p-1} \chi(t)\zeta_p^t$ is the Gauss sum associated with $\chi$ then

$$\sum_\chi g(\chi) = \sum_{t=0}^{p-1}\Big(\sum_\chi \chi(t)\Big)\zeta_p^t$$
$$= 1 + (p - 1)\zeta_p.$$

Thus $\zeta_p = (p - 1)^{-1}(-1 + \sum_\chi g(\chi))$ and therefore $\zeta_p$ is constructible if each
$g(\chi)$ is.

However $p - 1 = 2^n$ and since the characters form a group of order
$p - 1$ we see that the order of $\chi$ is $2^m$ for some $m$. Then using Proposition
8.3.3 we have $g(\chi)^{2^m} = \chi(-1)pJ(\chi, \chi)J(\chi, \chi^2) \cdots J(\chi, \chi^l)$ where $l = 2^m - 2$.
But $J(\chi, \chi^j) \in \mathbb{Z}[\zeta_{2^n}]$ so that by Lemma 1 $g(\chi)^{2^m}$ is constructible. It follows
that $g(\chi)$ is constructible and the proof is complete. □
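
The identity $\sum_\chi g(\chi) = 1 + (p - 1)\zeta_p$ used in the proof is easy to confirm in floating point. The sketch below is an illustration under our own conventions: the characters of $F^*$ are indexed by $k = 0, \ldots, p - 2$ through a primitive root `g` supplied by the caller, and, as in the text, the trivial character takes the value 1 at 0 while every other character vanishes there.

```python
import cmath


def sum_of_gauss_sums(p, g):
    """Sum of g(chi) over all characters chi of F_p^*.

    g is assumed to be a primitive root modulo the prime p; the
    character chi_k sends g**m to exp(2*pi*i*k*m/(p - 1)).
    """
    zeta = cmath.exp(2j * cmath.pi / p)
    index = {}                      # discrete logarithm table: g**m -> m
    x = 1
    for m in range(p - 1):
        index[x] = m
        x = x * g % p
    total = 0
    for k in range(p - 1):
        gauss = 1 if k == 0 else 0  # contribution of t = 0
        for t in range(1, p):
            chi_t = cmath.exp(2j * cmath.pi * k * index[t] / (p - 1))
            gauss += chi_t * zeta ** t
        total += gauss
    return total


p = 17                              # a Fermat prime; 3 is a primitive root mod 17
expected = 1 + (p - 1) * cmath.exp(2j * cmath.pi / p)
print(abs(sum_of_gauss_sums(p, 3) - expected) < 1e-9)
```

The comparison succeeds up to rounding error; the same check works for $p = 5$ with the primitive root 2.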


§12 Cubic Gauss Sums and the Problem of Kummer 

If $p$ is a prime $p \equiv 1 \ (4)$ then the simple argument of Proposition 6.3.2
showed that $g(\chi)^2 = p$ where

$$g(\chi) = \sum_{t=1}^{p-1}\Big(\frac{t}{p}\Big)\zeta^t = \sum_{t=0}^{p-1}\zeta^{t^2} = \sum_{t=0}^{p-1}\cos\frac{2\pi t^2}{p}$$

is the classical quadratic Gauss sum. Thus with little effort $g(\chi)$ was shown to
be one of the real roots of $x^2 - p = 0$. Using a more sophisticated argument,
we have shown in Section 4, Chapter 6 that actually $g(\chi)$ is always the largest
root, that is to say $g(\chi) = \sqrt{p}$.


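Proof. Since $(\zeta_{2^n})^2 = \zeta_{2^{n-1}}$ the result follows by induction ($\zeta_2$ is certainly
constructible!).

Lemma 2.

$$\sum_\chi \chi(t) = \begin{cases} p - 1, & t = 1, \\ 1, & t = 0, \\ 0, & t \neq 0, 1. \end{cases}$$

In the case of cubic Gauss sums the matter is more subtle. Let $p$ be a
prime $p \equiv 1 \ (3)$ and consider $\sum_{t=0}^{p-1}\cos(2\pi t^3/p) = G$. Write $p = \pi\bar\pi$ where $\pi$
is a complex primary prime in $\mathbb{Z}[\omega]$ and let $\chi_\pi$ be the cubic character associated
with $\pi$ as defined in Section 3.

Lemma 1. $G = g(\chi_\pi) + \overline{g(\chi_\pi)}$.

Proof. If $\zeta = e^{2\pi i/p}$ then since $G$ is real, and $-1 = (-1)^3$,

$$\begin{aligned}
G = \sum_{t=0}^{p-1}\zeta^{t^3} &= \sum_{t=0}^{p-1}(1 + \chi_\pi(t) + \chi_\pi^2(t))\zeta^t \\
&= g(\chi_\pi) + g(\chi_\pi^2) \\
&= g(\chi_\pi) + g(\bar\chi_\pi) \\
&= g(\chi_\pi) + \chi_\pi(-1)\overline{g(\chi_\pi)} \\
&= g(\chi_\pi) + \overline{g(\chi_\pi)}. \ □
\end{aligned}$$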

Notice that in the above proof $\chi$ can be any character of order 3. However
in the following lemma the choice of $\chi_\pi$ is essential. Write $\pi = a + b\omega$.

Lemma 2. $G$ is a real root of $x^3 - 3px - (2a - b)p = 0$.

Proof. By Lemma 1, writing $\chi$ for $\chi_\pi$,

$$\begin{aligned}
G^3 &= g(\chi)^3 + \overline{g(\chi)}^3 + 3g(\chi)\overline{g(\chi)}\,(g(\chi) + \overline{g(\chi)}) \\
&= p\pi + p\bar\pi + 3pG \\
&= 3pG + p(2a - b).
\end{aligned}$$

In the second step we have used the corollary to Lemma 1, Section 4. □

Corollary. $G$ is a root of $x^3 - 3px - Ap = 0$ where $4p = A^2 + 27B^2$,
$A \equiv 1 \ (3)$.

Proof. This is simply the corollary to Proposition 8.3.4. □
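
The corollary can be tested numerically. The following sketch is ours and makes no claim beyond floating-point evidence: `cubic_cosine_sum` evaluates $G$ from its definition, and the hypothetical helper `normalized_A` searches for the representation $4p = A^2 + 27B^2$ and fixes the sign of $A$ by $A \equiv 1 \ (3)$.

```python
import math


def cubic_cosine_sum(p):
    """G = sum over t = 0, ..., p - 1 of cos(2*pi*t**3 / p)."""
    return sum(math.cos(2 * math.pi * pow(t, 3, p) / p) for t in range(p))


def normalized_A(p):
    """Return A with 4p = A**2 + 27*B**2 and A = 1 (mod 3).

    Assumes p is a prime congruent to 1 modulo 3.
    """
    b = 1
    while 27 * b * b <= 4 * p:
        a_squared = 4 * p - 27 * b * b
        a = math.isqrt(a_squared)
        if a * a == a_squared:
            return a if a % 3 == 1 else -a
        b += 1
    raise ValueError("no representation 4p = A^2 + 27 B^2 found")


for p in (7, 13, 19, 31, 37):
    G = cubic_cosine_sum(p)
    A = normalized_A(p)
    residual = G ** 3 - 3 * p * G - A * p
    print(p, A, round(G, 6), abs(residual) < 1e-6)
```

For $p = 7$ one has $A = 1$ and $G = 1 + 6\cos(2\pi/7)$, and for $p = 13$ one has $A = -5$; in each listed case the residual of $x^3 - 3px - Ap$ at $G$ vanishes to within rounding error.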

Thus $G$ is twice the real part of $g(\chi_\pi)$ and is a root of the polynomial
$x^3 - 3px - Ap$. In the same manner as above we see that the other roots are
$2\,\mathrm{Re}(\omega g(\chi_\pi))$ and $2\,\mathrm{Re}(\omega^2 g(\chi_\pi))$. Using the fact that $|g(\chi_\pi)| = p^{1/2}$ it is a
simple matter to see that each of the intervals $(-2\sqrt{p}, -\sqrt{p})$, $(-\sqrt{p}, \sqrt{p})$,
and $(\sqrt{p}, 2\sqrt{p})$ contains precisely one of the roots (see Exercise 43). By the
corollary to Lemma 1, Section 4, the value of $g(\chi_\pi)$ is determined up to a factor of 1,
$\omega$, or $\omega^2$. Unable to find an expression for this root of unity for general $p$,
Kummer proposed a statistical study of the distribution of those primes for
which $G$, say, is the largest root of $x^3 - 3px - Ap$. He found, for example,
that among the primes less than 500, $G$ was in the interval $(\sqrt{p}, 2\sqrt{p})$ for 24
primes. The interval $(-2\sqrt{p}, -\sqrt{p})$ contained 7 primes and the middle
interval 14 primes. (See [164], Vol. 1, pp. 50, 296, 353.) Putting
$I_1 = (-2\sqrt{p}, -\sqrt{p})$, $I_2 = (-\sqrt{p}, \sqrt{p})$, $I_3 = (\sqrt{p}, 2\sqrt{p})$ and letting $N_j(B)$ be
the number of primes less than $B$ such that $G$ is in $I_j$, he noted that the ratio
$N_1(500) : N_2(500) : N_3(500)$ is roughly 1:2:3.

However in 1953, J. von Neumann and H. H. Goldstine, considering all
primes ($\equiv 1$ (3)) less than 9973, arrived at a ratio of roughly 2:3:4 [197].
They found $N_1(10^4) = 138$, $N_2(10^4) = 201$, $N_3(10^4) = 272$. They stated,
“These results would seem to indicate a significant departure from the
conjectured densities and a trend toward randomness.” Emma Lehmer
extended the calculations to include the first 1000 primes, $p \equiv 1$ (3), and
discovered a ratio approximately 3:4:5 [176]. Thus the suspicion arose that
indeed the values of $G$ are asymptotically uniformly distributed in the three
intervals. That this is indeed the case was established in 1978 by D. R.
Heath-Brown and S. J. Patterson in their paper, “The distribution of Kummer
sums at prime arguments” [147].
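
Kummer's experiment is easy to repeat numerically. The following is a minimal sketch, not taken from the text: it evaluates $G$ in floating point, finds $A$ by brute force, and tallies the three intervals. The function names are hypothetical, and the tallies can be compared with the counts 7, 14, 24 quoted above for the primes below 500.

```python
import math

def is_prime(n):
    """Trial division; adequate for the small primes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def kummer_sum(p):
    """G = sum over t mod p of cos(2*pi*t^3/p); t^3 is reduced mod p first."""
    return sum(math.cos(2 * math.pi * pow(t, 3, p) / p) for t in range(p))

def find_A(p):
    """Return A with 4p = A^2 + 27*B^2 and A = 1 (mod 3), for a prime p = 1 (mod 3)."""
    B = 1
    while 27 * B * B < 4 * p:
        r = 4 * p - 27 * B * B
        A = math.isqrt(r)
        if A * A == r:
            return A if A % 3 == 1 else -A
        B += 1
    raise ValueError("no representation found")

def interval_index(p):
    """1, 2 or 3 according to which of I_1, I_2, I_3 contains G."""
    G, s = kummer_sum(p), math.sqrt(p)
    return 1 if G < -s else (2 if G < s else 3)

def interval_counts(bound):
    """Tally N_1, N_2, N_3 over the primes p = 1 (mod 3) below bound."""
    counts = {1: 0, 2: 0, 3: 0}
    for p in range(7, bound):
        if p % 3 == 1 and is_prime(p):
            counts[interval_index(p)] += 1
    return counts

if __name__ == "__main__":
    print(interval_counts(500))
    G, A = kummer_sum(7), find_A(7)
    print(G ** 3 - 3 * 7 * G - A * 7)   # residual of the cubic, close to 0
```

Floating point is sufficient here because the three roots are well separated from the endpoints $\pm\sqrt{p}$; an exact computation would require working in the cyclotomic field.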

We mention that J. W. S. Cassels [108] conjectured a precise expression
for $g(\chi_\pi)$ involving elliptic functions. This conjecture was established by C. R.
Matthews [186]. Furthermore an explicit elementary expression has been
obtained for the biquadratic Gauss sum by Matthews [186]. The result of
Matthews is as follows. Let $p$ be prime, $p \equiv 1$ (4), and write $p = \pi\bar\pi$, $\pi$ primary,
$\pi = a + bi$. Define $\beta = \pm i$ by $((p-1)/2)! \equiv \beta$ $(\pi)$. If $g(\chi_\pi)$ is the biquadratic

Gauss sum attached to x n then by Proposition 9A0.l,g(x n ) 2 = ( 一 l) (p_ 1)/4 7^/^. 

Thus g(x n ) = 1)/4 7c^/p where the square root has positive real 

part. Matthews proved that s = — ^^(20(21 b|/a) where (2\b\/a) is the 
Jacobi symbol. See also J. H. Loxton [182], and B. C. Berndt and R. J. Evans 
[93]. 

Notes 

For the early history of cubic and biquadratic reciprocity we note that Euler,
during the years 1748-1750, conjectured Proposition 9.6.2 concerning the
cubic character of 2, as well as similar results for the integers 3, 5, and 7.
He also conjectured that 2 is a fourth power modulo $p$, $p \equiv 1$ (4), iff $p =
a^2 + 64b^2$ (Exercise 6, Chapter 5) and stated similar results for the primes 3
and 5. All of Euler’s conjectures concerning these special cases of reciprocity
were correct, a remarkable example of his “inductive” ability. The general
biquadratic character of 2 (Exercise 37) was established by Gauss in his first
memoir on biquadratic residues (1828) while the general law of biquadratic
reciprocity was stated in his second memoir on the same subject (1832).
For further comments on the history of these results see the
paper by M. J. Collison [116].

Gauss wrote to Alexander von Humboldt in 1846 that Eisenstein’s
mathematical talent was such as nature confers upon few in each century.
In 1844, at the age of twenty-one, Eisenstein published a total of 25 papers in
Crelle’s journal. The proofs of cubic and biquadratic reciprocity given in
this chapter as well as the proofs of quadratic reciprocity given in Chapter 6
are among them (see [28], [130], [131]). The collected works of this
remarkable genius, dead at 29, are now available. An informative and charming
account of Eisenstein’s life and research has been given by A. Weil in his
review of the collected works [239]. One should also read the beautiful
paper by Weil, “La Cyclotomie, jadis et naguère” [238]. In a later chapter we
shall prove a generalization of these reciprocity laws, the celebrated
Eisenstein reciprocity law. A discussion of Eisenstein’s other proofs of biquadratic
reciprocity is contained in H. Smith’s report [72]. As far as cubic reciprocity
is concerned Jacobi claims to have given the proof in his lectures of 1837
but the first published proof is definitely due to Eisenstein in 1844. The
dispute over priority appears to have been quite bitter.

For the actual construction of a 17-sided polygon see Hardy and Wright
[40], p. 61. Gauss’ treatment of cyclotomy is contained in §7 of his
Disquisitiones Arithmeticae [136]. In §335 he mentions that the techniques
developed there extend to other transcendental functions such as those
connected with $\int dx/\sqrt{1 - x^4}$, the integral arising from arc length on a
lemniscate. Gauss recorded in his diary on March 21, 1797 that he had
succeeded in dividing the arc of the lemniscate into five equal parts. In 1827
Abel was able to show that, as in the case of a circle, the arc of a lemniscate
can be divided into $p$ equal parts with ruler and compass when $p$ is a Fermat
prime. For an examination of Abel’s proof from a modern point of view
see the article by M. Rosen [212].

In recent times there has been a renewed interest in rational reciprocity 
laws. The interested reader should consult the survey article by E. Lehmer 
[175] as well as the paper by H. von Lienen [181]. 


Exercises 

1. If $\alpha \in \mathbb{Z}[\omega]$, show that $\alpha$ is congruent to either 0, 1, or $-1$ modulo $1 - \omega$.

2. From now on we shall set $D = \mathbb{Z}[\omega]$ and $\lambda = 1 - \omega$. For $\mu$ in $D$ show that we can
write $\mu = (-1)^a \omega^b \lambda^c \pi_1^{a_1} \pi_2^{a_2} \cdots \pi_t^{a_t}$, where $a$, $b$, $c$, and the $a_i$ are nonnegative integers
and the $\pi_i$ are primary primes.

3. Let $\gamma$ be a primary prime. To evaluate $\chi_\gamma(\mu)$ we see, by Exercise 2, that it is enough to
evaluate $\chi_\gamma(-1)$, $\chi_\gamma(\omega)$, $\chi_\gamma(\lambda)$, and $\chi_\gamma(\pi)$, where $\pi$ is a primary prime. Since $-1 = (-1)^3$
we have $\chi_\gamma(-1) = 1$. We now consider $\chi_\gamma(\omega)$. Let $\gamma = a + b\omega$ and set $a = 3m - 1$
and $b = 3n$. Show that $\chi_\gamma(\omega) = \omega^{m+n}$.

4. (continuation) Show that $\chi_\gamma(\omega) = 1$, $\omega$, or $\omega^2$ according to whether $\gamma$ is congruent
to 8, 2, or 5 modulo $3\lambda$. In particular, if $q$ is a rational prime, $q \equiv 2$ (3), then $\chi_q(\omega) = 1$,
$\omega$, or $\omega^2$ according to whether $q \equiv 8$, 2, or 5 (9). [Hint: $\gamma = a + b\omega = -1 +
3(m + n\omega)$, and so $\gamma \equiv -1 + 3(m + n)$ $(3\lambda)$.]

5. In the text we stated Eisenstein’s result $\chi_\gamma(\lambda) = \omega^{2m}$. Show that $\chi_\gamma(3) = \omega^{2n}$.
6. Prove that

(a) $\chi_\gamma(\lambda) = 1$ for $\gamma \equiv 8$, $8 + 3\omega$, $8 + 6\omega$ (9).

(b) $\chi_\gamma(\lambda) = \omega$ for $\gamma \equiv 5$, $5 + 3\omega$, $5 + 6\omega$ (9).

(c) $\chi_\gamma(\lambda) = \omega^2$ for $\gamma \equiv 2$, $2 + 3\omega$, $2 + 6\omega$ (9).

7. Find primary primes associate to $1 - 2\omega$, $-7 - 3\omega$, and $3 - \omega$.

8. Factor the following numbers into primes in $D$: 7, 21, 45, 22, and 143.

9. Show that $\bar\alpha$, the residue class of $\alpha$, is a cube in the field $D/\pi D$ iff $\alpha^{(N\pi - 1)/3} \equiv 1$ $(\pi)$.
Conclude that there are $(N\pi - 1)/3$ cubes in $D/\pi D$.

10. What is the factorization of $x^{24} - 1$ in $D/5D$?

11. How many cubes are there in $D/5D$?

12. Show that $\omega\lambda$ has order 8 in $D/5D$ and that $\omega^2\lambda$ has order 24. [Hint: Show first that
$(\omega\lambda)^2$ has order 4.]

13. Show that $\pi$ is a cube in $D/5D$ iff $\pi \equiv 1$, 2, 3, 4, $1 + 2\omega$, $2 + 4\omega$, $3 + \omega$, or $4 + 3\omega$ (5).

14. For which primes $\pi \in D$ is $x^3 \equiv 5$ $(\pi)$ solvable?

15. Suppose that $p \equiv 1$ (3) and that $p = \pi\bar\pi$, where $\pi$ is a primary prime in $D$. Show that
$x^3 \equiv a$ $(p)$ is solvable in $\mathbb{Z}$ iff $\chi_\pi(a) = 1$. We assume that $a \in \mathbb{Z}$.

16. Is $x^3 \equiv 2 - 3\omega$ (11) solvable? Since $D/11D$ has 121 elements this is hard to resolve
by straightforward checking. Fill in the details of the following proof that it is not
solvable. $\chi_{11}(2 - 3\omega) = \chi_{2-3\omega}(11)$ and so we shall have a solution iff $x^3 \equiv 11$ $(2 - 3\omega)$
is solvable. This congruence is solvable iff $x^3 \equiv 11$ (7) is solvable in $\mathbb{Z}$. However,
$x^3 \equiv a$ (7) is solvable in $\mathbb{Z}$ iff $a \equiv 1$ or 6 (7).

17. An element $\gamma \in D$ is called primary if $\gamma \equiv 2$ (3). If $\gamma$ and $\rho$ are primary, show that
$-\gamma\rho$ is primary. If $\gamma$ is primary, show that $\gamma = \pm\gamma_1\gamma_2 \cdots \gamma_t$, where the $\gamma_i$ are (not
necessarily distinct) primary primes.

18. (continuation) If $\gamma = \pm\gamma_1\gamma_2 \cdots \gamma_t$ is a primary decomposition of the primary
element $\gamma$, define $\chi_\gamma(\alpha) = \chi_{\gamma_1}(\alpha)\chi_{\gamma_2}(\alpha) \cdots \chi_{\gamma_t}(\alpha)$. Prove that $\chi_\gamma(\alpha) = \chi_\gamma(\beta)$ if $\alpha \equiv \beta$ $(\gamma)$
and $\chi_\gamma(\alpha\beta) = \chi_\gamma(\alpha)\chi_\gamma(\beta)$. If $\rho$ is primary, show that $\chi_\rho(\alpha)\chi_\gamma(\alpha) = \chi_{-\rho\gamma}(\alpha)$.

19. Suppose that $\gamma = A + B\omega$ is primary and that $A = 3M - 1$ and $B = 3N$. Prove
that $\chi_\gamma(\omega) = \omega^{M+N}$ and that $\chi_\gamma(\lambda) = \omega^{2M}$.

20. If $\gamma$ and $\rho$ are primary, show that $\chi_\gamma(\rho) = \chi_\rho(\gamma)$.

21. If $\gamma$ is primary, show that there are infinitely many primary primes $\pi$ such that
$x^3 \equiv \gamma$ $(\pi)$ is not solvable. Show also that there are infinitely many primary primes
$\pi$ such that $x^3 \equiv \omega$ $(\pi)$ is not solvable and the same for $x^3 \equiv \lambda$ $(\pi)$. (Hint: Imitate
the proof of Theorem 3 of Chapter 5.)

22. (continuation) Show in general that if $\gamma \in D$ and $x^3 \equiv \gamma$ $(\pi)$ is solvable for all but
finitely many primary primes $\pi$, then $\gamma$ is a cube in $D$.

23. Suppose that $p \equiv 1$ (3). Use Exercise 5 to show that $x^3 \equiv 3$ $(p)$ is solvable in $\mathbb{Z}$
iff $p$ is of the form $4p = C^2 + 243B^2$.
The following three exercises give K. Williams’ elegant proof of the complex case of the
supplement to the law of cubic reciprocity [245]. The reader may wish to consult the
hints at the end of the book.


24. Let $\pi = a + b\omega$ be a complex primary element of $D = \mathbb{Z}[\omega]$. Put $a = 3m - 1$,
$b = 3n$, $p = N(\pi)$.

(a) $(p - 1)/3 \equiv -2m + n$ (3).

(b) $(a^2 - 1)/3 \equiv m$ (3).

(c) $\chi_\pi(a) = \omega^m$.

(d) $\chi_\pi(a + b) = \omega^{2n}\chi_\pi(1 - \omega)$.

25. Show that $\chi_{a+b}(\pi)$ may be computed as follows.

(a) $\chi_{a+b}(\pi) = \chi_{a+b}(1 - \omega)$.

(b) $\chi_{a+b}(\pi) = \omega^{2(m+n)}$.

26. Combine the previous two exercises to conclude that $\chi_\pi(1 - \omega) = \omega^{2m}$.


The following four exercises are taken from Matthews [186].

27. Let $\pi = a + bi$ be a primary irreducible in $\mathbb{Z}[i]$, $b \ne 0$. Show

(a) $a \equiv (-1)^{(p-1)/4}$ (4), $p = N(\pi)$.

(b) $b \equiv 1 - (-1)^{(p-1)/4}$ (4).

28. The notation being as in Exercise 27 show $\chi_\pi(\bar\pi) = \chi_\pi(2)\chi_\pi(a)$.

29. By Exercise 27, $a(-1)^{(p-1)/4}$ is primary. Use biquadratic reciprocity to show

30. Use the preceding two exercises to show $\chi_\pi(\bar\pi) = \chi_\pi(-2)(-1)^{(a^2-1)/8}$.

31. Let $p$ be prime, $p \equiv 1$ (4). Show that $p = a^2 + b^2$ where $a$ and $b$ are uniquely
determined by the conditions $a \equiv 1$ (4), $b \equiv -((p-1)/2)!\,a$ $(p)$.


The following five exercises are taken from Eisenstein [130], §9.

32. Let $p$ be prime, $p \equiv 1$ (4), and write $p = \pi\bar\pi$, $\pi \in \mathbb{Z}[i]$. Show $\chi_p(1 + i) = i^{(p-1)/4}$.

33. Let $q$ be a positive prime, $q \equiv 3$ (4). Show $\chi_q(1 + i) = (-i)^{(q+1)/4}$. [Hint: $(1 + i)^{q-1} \equiv
-i$ $(q)$.]

34. Let $\pi = a + bi$ be a primary irreducible, $(a, b) = 1$. Show

(a) if $\pi \equiv 1$ (4) then $\chi_\pi(a) = i^{(a-1)/2}$.

(b) if $\pi \equiv 3 + 2i$ (4) then $\chi_\pi(a) = -i^{(-a-1)/2}$.

35. If $\pi = a + bi$ is as in Exercise 34 show $\chi_\pi(a)\chi_\pi(1 + i) = i^{(3(a+b-1))/4}$. [Hint:
$a(1 + i) = a + b + i(a + bi)$. Generalize Exercises 32 and 33 to any integer
$\equiv 1$ (4) and use Proposition 9.9.8. Note $a + b \equiv 1$ (4).]

36. Remove the restriction $(a, b) = 1$ in Exercise 34.

37. Combine Exercises 32, 33, 34, and 35 to show $\chi_\pi(1 + i) = i^{(a-b-b^2-1)/4}$. Show that
this result implies Exercise 26 of Chapter 5 (the “biquadratic character of 2”).

38. Prove part (d) of Proposition 9.8.3. 

39. Let $p \equiv 1$ (6) and write $4p = A^2 + 27B^2$, $A \equiv 1$ (3). Put $m = (p - 1)/6$. Show
$\binom{3m}{m} \equiv -A$ $(p)$ $\Leftrightarrow$ $2 \mid B$.
40. Let $p \equiv 1$ (6), and put $p = \pi\bar\pi$ where $\pi$ is primary. Write $\pi = a + b\omega$ and show

(a) If $\chi_\pi(2) = \omega$ then $2b - a \equiv -\binom{3m}{m}$ $(p)$.

(b) If $\chi_\pi(2) = \omega^2$ then $a + b \equiv \binom{3m}{m}$ $(p)$.

(c) If $\chi_\pi(2) = \omega$ put $A = 2a - b$, $B = b/3$. Show $(A - 9B)/2 \equiv \binom{3m}{m}$ $(p)$.

(d) If $\chi_\pi(2) = \omega^2$ put $2a - b = A$ and $B = -b/3$. Show $(A - 9B)/2 \equiv \binom{3m}{m}$ $(p)$.

(e) Show that the “normalization” of $B$ in (c) and (d) is equivalent to $A \equiv B$ (4).
[Recall $\chi_\pi(2) \equiv \pi$ (2) by cubic reciprocity.]

41. Let $p \equiv 1$ (6), $4p = A^2 + 27B^2$, $A \equiv 1$ (3), $A$ and $B$ odd. Put $\pi = a + b\omega$, $2a -
b = A$, $b = 3B$. Let $\chi_\pi$ be the cubic residue character.

(a) If $\chi_\pi(2) = \omega$ show $N(x^3 + 2y^3 = 1) = p + 1 + 2b - a \equiv 0$ (2).

(b) If $\chi_\pi(2) = \omega^2$ show $N(x^3 + 2y^3 = 1) = p + 1 - a - b \equiv 0$ (2).

(c) Show that if $A \equiv B$ (4) then, assuming $\chi_\pi(2) \ne 1$, one has $\chi_\pi(2) = \omega$.

(d) If $\chi_\pi(2) \ne 1$, $A \equiv B$ (4) then $2^{(p-1)/3} \equiv (-A - 3B)/6B \equiv (A + 9B)/(A - 9B)$ $(\pi)$.
(This generalization of Euler’s criterion is due to E. Lehmer [174]. See also
K. Williams [243].)

42. The notation being as in Section 12 show that the minimal polynomial of g ( x n ) is 
x 3 — 3px — Ap. 

43. Find the local maxima and minima of $x^3 - 3px - Ap$ and show that each of the
intervals $(-2\sqrt{p}, -\sqrt{p})$, $(-\sqrt{p}, \sqrt{p})$, $(\sqrt{p}, 2\sqrt{p})$ contains exactly one of the
values $2\,\mathrm{Re}(\omega^k g(\chi_\pi))$, $k = 0, 1, 2$.

44. Let $n \in \mathbb{Z}$, $n = s_1 \cdots s_t$, $n \equiv 1$ (4), $s_i \equiv 1$ (4), $i = 1, \ldots, t$. Show $(n - 1)/4 \equiv
\sum_{i=1}^{t} (s_i - 1)/4$ (4).

45. Let $\pi = a + bi \in \mathbb{Z}[i]$ and $q \equiv 3$ (4) a rational prime. Show $\pi^q \equiv \bar\pi$ $(q)$.



Chapter 10 

Equations over Finite Fields 


In this chapter we shall introduce a new point of view. 
Diophantine problems over finite fields will be put into the 
context of elementary algebraic geometry. The notions of 
affine space, projective space, and points at infinity will be 
defined. 

After these problems of language have been dealt with, 
we shall prove a very general theorem due to C. Chevalley, 
which states that a polynomial in several variables with 
no constant term over a finite field always has nontrivial 
zeros if the number of variables exceeds the degree. 

Next, our interest turns to the problem of generalizing 
the results of Chapter 8 to arbitrary finite fields. This 
turns out to be relatively easy. These more general results 
are of interest for their own sake and are crucial to the 
discussion of the zeta function, which we shall take up in 
Chapter 11 . 


§1 Affine Space, Projective Space, and Polynomials

Let $F$ be a field and $A^n(F)$ the set of $n$-tuples $(a_1, a_2, \ldots, a_n)$ with $a_i \in F$.
$A^n(F)$ can be considered as a vector space by defining addition and scalar
multiplication in the usual way. We shall be concerned principally with the
underlying set, which will be called affine $n$-space over $F$. As usual the point
$(0, 0, \ldots, 0)$ will be called the origin. If there is no chance of confusion we
shall denote the point $(a_1, a_2, \ldots, a_n)$ by the single letter $a$.

Projective $n$-space over $F$, $P^n(F)$, is a somewhat more difficult concept.
We first consider $A^{n+1}(F)$, denoting its points by $(a_0, a_1, \ldots, a_n)$. On the
set $A^{n+1}(F) - \{(0, 0, \ldots, 0)\}$ (affine $(n+1)$-space from which the origin
has been removed) we define an equivalence relation. $(a_0, a_1, \ldots, a_n)$ is said
to be equivalent to $(b_0, b_1, \ldots, b_n)$ if there is a $\gamma \in F^*$ such that $a_0 = \gamma b_0$,
$a_1 = \gamma b_1, \ldots, a_n = \gamma b_n$. This is easily seen to be an equivalence relation.
The equivalence classes are called points of $P^n(F)$. If $a \in A^{n+1}(F)$ is distinct
from the origin, then $[a]$ will denote the equivalence class containing $a$.
$a$ will be called a representative of $[a]$. Geometrically, the points of $P^n(F)$
are in one-to-one correspondence with the lines in $A^{n+1}(F)$ that pass through
the origin.

If $F$ is a finite field with $q$ elements, then clearly $A^n(F)$ has $q^n$ elements.
$P^n(F)$ has $q^n + q^{n-1} + \cdots + q + 1$ elements. To see this, notice that
$A^{n+1}(F) - \{(0, 0, \ldots, 0)\}$ has $q^{n+1} - 1$ elements. Since $F^*$ has $q - 1$
elements each equivalence class has $q - 1$ elements. Thus $P^n(F)$ has
$(q^{n+1} - 1)/(q - 1) = q^n + q^{n-1} + \cdots + q + 1$ elements.
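
The count can be confirmed by brute force for a prime field. The sketch below is an illustration only (the function names are hypothetical, and $F$ is assumed to be $\mathbb{Z}/p\mathbb{Z}$ with $p$ prime): it picks from each equivalence class the unique representative whose first nonzero coordinate is 1.

```python
from itertools import product

def projective_points(n, p):
    """Normalized representatives of the points of P^n(F_p), p prime.

    Each class [a] contains exactly one tuple whose first nonzero
    coordinate equals 1, so listing those tuples lists the points.
    """
    points = []
    for v in product(range(p), repeat=n + 1):
        first_nonzero = next((c for c in v if c != 0), None)
        if first_nonzero == 1:          # also excludes the origin
            points.append(v)
    return points

def projective_space_size(n, q):
    """q^n + q^(n-1) + ... + q + 1."""
    return (q ** (n + 1) - 1) // (q - 1)

if __name__ == "__main__":
    for n, p in [(1, 5), (2, 3), (3, 2)]:
        print(n, p, len(projective_points(n, p)), projective_space_size(n, p))
```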

In general $P^n(F)$ has more points than $A^n(F)$. This is made more precise
as follows. If $[x] \in P^n(F)$ and $x_0 \ne 0$, set $\phi([x]) = (x_1/x_0, x_2/x_0, \ldots,
x_n/x_0) \in A^n(F)$. This map is easily seen to be independent of the
representative $x$.

Lemma 1. Let $H$ be the set of $[x] \in P^n(F)$ such that $x_0 = 0$. Then $\phi$ maps
$P^n(F) - H$ to $A^n(F)$ and this map is one to one and onto. (If $S$ and $T$ are sets, then
$S - T$ is the set of elements in $S$ but not in $T$.)

Proof. If $\phi([x]) = \phi([y])$, then $x_i/x_0 = y_i/y_0$ for $i = 0, 1, \ldots, n$. Let $\gamma = y_0/x_0$.
Then $\gamma x_i = y_i$ for $i = 0, 1, \ldots, n$ and so $[x] = [y]$.

If $v = (v_1, v_2, \ldots, v_n) \in A^n(F)$, set $w = (1, v_1, v_2, \ldots, v_n)$. Then $\phi([w]) = v$. □


The set $H$ is called the hyperplane at infinity. It is easy to see that $H$
has the structure of $P^{n-1}(F)$. Thus $P^n(F)$ is made up of two pieces, one a
copy of $A^n(F)$, called the finite points, and the other a copy of $P^{n-1}(F)$,
called the points at infinity.

Notice that $P^0(F)$ consists of just one point. Thus $P^1(F)$ has only one
point at infinity. Similarly $P^2(F)$ has a (projective) line at infinity, etc.

Now that affine space and projective space have been defined we take 
up the subject of polynomials and see how they determine sets called hyper¬ 
surfaces. 

Let $F[x_1, x_2, \ldots, x_n]$ be the ring of polynomials in $n$ variables over $F$.
If $f \in F[x_1, \ldots, x_n]$, then

$$f(x) = \sum_{(i_1, i_2, \ldots, i_n)} a_{i_1 i_2 \cdots i_n} x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n},$$

where the sum is over a finite set of $n$-tuples of nonnegative integers
$(i_1, \ldots, i_n)$, where $a_{i_1 i_2 \cdots i_n} \ne 0$. A polynomial of the form $x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$
is called a monomial. Its total degree is defined to be $i_1 + i_2 + \cdots + i_n$;
its degree in the variable $x_m$ is defined as $i_m$. The degree of $f(x)$ is the maximum
of the total degrees of monomials that occur in $f(x)$ with nonzero coefficients.
The degree in $x_m$ is the maximum of the degrees in $x_m$ of monomials that
occur in $f(x)$ with nonzero coefficients. Call these two numbers $\deg f(x)$ and
$\deg_m f(x)$. Then

(a) $\deg f(x)g(x) = \deg f(x) + \deg g(x)$.

(b) $\deg_m f(x)g(x) = \deg_m f(x) + \deg_m g(x)$.
If all the monomials that occur in $f(x)$ have degree $l$, then $f(x)$ is said to
be homogeneous of degree $l$.

For example, if $f(x) = 1 + x_1x_2 + x_2x_3 + x_4^3$, then $\deg f(x) = 3$,
$\deg_1 f(x) = \deg_2 f(x) = \deg_3 f(x) = 1$, and $\deg_4 f(x) = 3$. $f(x)$ is not
homogeneous, but $h(x) = x_1^3 + x_2^3 + x_3^3 + x_1x_2x_3$ is homogeneous of
degree 3.

A homogeneous polynomial is sometimes called a form. A form of 
degree 2 is called a quadratic form, and one of degree 3 is called a cubic 
form, etc. 

Suppose that $K$ is a field containing $F$. If $f(x) \in F[x_1, x_2, \ldots, x_n]$ and
$a \in A^n(K)$, we can substitute $a_i$ for $x_i$ and compute $f(a)$.

This shows that $f(x)$ defines a function from $A^n(K)$ to $K$ by sending $a$
to $f(a)$. A point $a \in A^n(K)$ such that $f(a) = 0$ is called a zero of $f(x)$.

If $K$ is a finite field with $q$ elements, then $x^q - x$ defines the zero function
on $A^1(K)$. Thus it may happen that a nonzero polynomial gives rise to the
zero function. This cannot happen when $K$ is infinite (see the Exercises).

Let $f(x)$ be a nonzero polynomial and define $H_f(K) = \{a \in A^n(K) \mid f(a) = 0\}$.
$H_f(K)$ is called the hypersurface defined by $f$ in $A^n(K)$. When $K$ is a finite
field, $H_f(K)$ is a finite set and it is natural to ask for the number of points in
$H_f(K)$. In Chapter 8 we dealt with a number of special cases of this problem.

We now wish to define a projective hypersurface. Let $h(x) \in
F[x_0, x_1, \ldots, x_n]$ be a nonzero homogeneous polynomial of degree $d$. As
before, $K$ is a field containing $F$. For $\gamma \in K^*$ we have $h(\gamma x) = \gamma^d h(x)$. It
follows that if $a \in A^{n+1}(K)$ and $h(a) = 0$, then $h(\gamma a) = 0$. Thus we may
define $H_h(K) = \{[a] \in P^n(K) \mid h(a) = 0\}$. This set is called the hypersurface
defined by $h$ in $P^n(K)$. Again, if $K$ is finite, we can ask for the number of points
in $H_h(K)$.

More generally if $f_1, \ldots, f_m$ are polynomials in $F[x_1, \ldots, x_n]$ define
$V = \{(a_1, \ldots, a_n) \mid a_i \in F,\ i = 1, \ldots, n,\ f_j(a_1, \ldots, a_n) = 0,\ j = 1, \ldots, m\}$. $V$ is
called an algebraic set defined over $F$. If the ideal defined by $f_1, \ldots, f_m$ in
$F[x_1, \ldots, x_n]$ is prime then $V$ is called an algebraic variety. Similarly, the
set of common projective zeros of a finite set of homogeneous polynomials in
$F[x_0, \ldots, x_n]$ is called a projective algebraic set.

It turns out that working with projective space leads to more unified 
results than working with affine space. We shall illustrate this point after 
defining the projective closure of an affine hypersurface. 

Let $f(x) \in F[x_1, x_2, \ldots, x_n]$, and define $\bar f(y) = \bar f(y_0, y_1, \ldots, y_n)$ by

$$\bar f(y) = y_0^{\deg f}\, f\!\left(\frac{y_1}{y_0}, \frac{y_2}{y_0}, \ldots, \frac{y_n}{y_0}\right).$$

We shall see in a moment that $\bar f$ is a homogeneous polynomial. It will give
rise to a hypersurface in $P^n(K)$. Roughly speaking, the new hypersurface will
be obtained from $H_f(K)$ by adding points at infinity.
Lemma 2. $\bar f(y)$ is a homogeneous polynomial of degree equal to $\deg f$. Moreover,
$\bar f(1, y_1, y_2, \ldots, y_n) = f(y_1, y_2, \ldots, y_n)$.

Proof. Set $d = \deg f$ and consider a monomial $x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$ of degree $l \le d$.
Then $y_0^d (y_1/y_0)^{i_1} \cdots (y_n/y_0)^{i_n} = y_0^{d-l} y_1^{i_1} y_2^{i_2} \cdots y_n^{i_n}$, which is of degree $d$. Thus
in $\bar f(y)$ all the monomials have degree $d$, which proves the first statement.
The second statement is immediate from the definition. □

As examples, if $f(x) = x_1^2 + x_2^2 - 1$, then $\bar f(y) = y_1^2 + y_2^2 - y_0^2$; if
$f(x) = 1 + 2x_1^3 - 3x_2^2$, then $\bar f(y) = y_0^3 + 2y_1^3 - 3y_0y_2^2$.
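
The passage from $f$ to $\bar f$ is mechanical: each monomial of degree $l$ is multiplied by $y_0^{d-l}$. A minimal sketch follows, under the assumption (made here only for illustration) that a polynomial is stored as a dictionary from exponent tuples to coefficients; `homogenize` and `evaluate` are hypothetical helper names.

```python
def homogenize(f):
    """Projective closure of f.

    f maps exponent tuples (i_1, ..., i_n) to nonzero coefficients.
    The result maps (i_0, i_1, ..., i_n) to the same coefficients,
    where i_0 = d - (i_1 + ... + i_n) and d = deg f.
    """
    d = max(sum(exps) for exps in f)
    return {(d - sum(exps),) + tuple(exps): c for exps, c in f.items()}

def evaluate(f, point):
    """Value of the polynomial f at the given point (ordinary integers)."""
    total = 0
    for exps, c in f.items():
        term = c
        for x, i in zip(point, exps):
            term *= x ** i
        total += term
    return total

if __name__ == "__main__":
    circle = {(2, 0): 1, (0, 2): 1, (0, 0): -1}      # x1^2 + x2^2 - 1
    print(homogenize(circle))                        # y1^2 + y2^2 - y0^2
    g = {(0, 0): 1, (3, 0): 2, (0, 2): -3}           # 1 + 2*x1^3 - 3*x2^2
    print(homogenize(g))                             # y0^3 + 2*y1^3 - 3*y0*y2^2
```

Setting $y_0 = 1$ in the output recovers $f$, which is the second statement of Lemma 2.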

Consider the hypersurface $H_f(K) \subset A^n(K)$. $\bar f(y)$ is homogeneous in
the variables $y_0, y_1, \ldots, y_n$ and so $\bar f$ defines a hypersurface $H_{\bar f}(K)$ in $P^n(K)$.
This projective hypersurface is called the projective closure of $H_f(K)$ in
$P^n(K)$.

Let $\lambda: A^n(K) \to P^n(K)$ by $\lambda(a_1, a_2, \ldots, a_n) = [1, a_1, a_2, \ldots, a_n]$. $\lambda$ is one
to one and moreover the image of $H_f(K)$ under $\lambda$ is contained in $H_{\bar f}(K)$
since clearly $\bar f(1, a_1, \ldots, a_n) = f(a_1, a_2, \ldots, a_n) = 0$ for all $a \in H_f(K)$.
In general $H_{\bar f}(K)$ has more points than $H_f(K)$, namely, the intersection of
$H_{\bar f}(K)$ with the hyperplane at infinity.

All this will become clearer by means of examples, but before giving
some we recall the definitions of the maps $\phi$ and $\lambda$ and give a diagrammatic
picture of $P^n(K)$:

$\lambda: A^n(K) \to P^n(K)$ by $\lambda(a_1, a_2, \ldots, a_n) = [1, a_1, a_2, \ldots, a_n]$,
$\phi: P^n(K) - H \to A^n(K)$ by $\phi([x_0, x_1, \ldots, x_n]) = (x_1/x_0, x_2/x_0, \ldots, x_n/x_0)$.

$P^n(K)$ consists of two pieces:

    im $\lambda \approx A^n(K)$      Finite points
    $H \approx P^{n-1}(K)$           Points at infinity


Examples 

1. $f(x) = x_1^2 + x_2^2 - 1$ over the field $F = \mathbb{Z}/p\mathbb{Z}$.

We have seen in Chapter 8, Section 3, that $f(x) = 0$ has $p - 1$ solutions
if $p \equiv 1$ (4) and $p + 1$ solutions if $p \equiv 3$ (4).

$\bar f(y) = y_1^2 + y_2^2 - y_0^2$. The solutions $[\beta_0, \beta_1, \beta_2]$, where $\beta_0 \ne 0$, correspond
to the affine solutions $(\beta_1/\beta_0, \beta_2/\beta_0)$. Suppose that $[0, \beta_1, \beta_2]$ is a solution.
Then $\beta_1^2 + \beta_2^2 = 0$ or $(\beta_2/\beta_1)^2 = -1$. If $p \equiv 1$ (4), there is an $\alpha \in F$ such
that $\alpha^2 = -1$ and in this case there are two points at infinity, namely,
$[0, 1, \alpha]$ and $[0, 1, -\alpha]$. If $p \equiv 3$ (4), there is no $\alpha \in F$ such that $\alpha^2 = -1$
and consequently there are no points at infinity. In both cases, then, the
hypersurface $H_{\bar f}(F)$ has exactly $p + 1$ points.
2. $f(x) = x_1^n + x_2^n - 1$ over $F = \mathbb{Z}/p\mathbb{Z}$ where $p \equiv 1$ $(n)$.

We have $\bar f(y) = y_1^n + y_2^n - y_0^n$. Thus the points at infinity on $H_{\bar f}(F)$
are of the form $[0, y_1, y_2]$, where $y_1^n + y_2^n = 0$. If $-1$ is not an $n$th power in $F$,
then there are no points at infinity. If $\alpha^n = -1$ for some $\alpha \in F$, then there are
$n$ solutions to $x^n = -1$ in $F$ [this follows from Proposition 4.2.1 since
$p \equiv 1$ $(n)$]. Call these solutions $\alpha_1 = \alpha, \alpha_2, \ldots, \alpha_n$. Then $[0, 1, \alpha_1], \ldots,
[0, 1, \alpha_n]$ are the points at infinity that are zeros of $\bar f(y)$. In the
notation of Chapter 8, Section 4, the number of points at infinity is $\delta_n(-1)n$, and
$N(x_1^n + x_2^n = 1) + \delta_n(-1)n$ is the number of points on the projective
hypersurface (curve) defined by $y_1^n + y_2^n - y_0^n = 0$. Since the number of points in
$P^1(F)$ is $p + 1$ the formula in Proposition 8.4.1 can be interpreted in the
following way: The number of points on the projective curve $y_1^n + y_2^n -
y_0^n = 0$ over $\mathbb{Z}/p\mathbb{Z}$ differs from the number of points on the projective line by
an error term that does not exceed $(n - 1)(n - 2)\sqrt{p}$.
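
Examples 1 and 2 can be checked by direct enumeration for small primes. The sketch below is only an illustration with hypothetical function names; it assumes $F = \mathbb{Z}/p\mathbb{Z}$ and counts finite points and points at infinity separately, exactly as in the discussion above.

```python
import math

def fermat_curve_counts(n, p):
    """Return (finite, infinite) point counts of y1^n + y2^n = y0^n in P^2(F_p)."""
    # Finite points: y0 = 1, i.e. affine solutions of x1^n + x2^n = 1.
    finite = sum(
        1
        for x1 in range(p)
        for x2 in range(p)
        if (pow(x1, n, p) + pow(x2, n, p) - 1) % p == 0
    )
    # Points at infinity: y0 = 0. Then y1 = 0 would force y2 = 0, so every
    # such point has a unique representative [0, 1, a] with 1 + a^n = 0.
    infinite = sum(1 for a in range(p) if (1 + pow(a, n, p)) % p == 0)
    return finite, infinite

def within_bound(n, p):
    """Check |N - (p + 1)| <= (n - 1)(n - 2) sqrt(p) for the projective count N."""
    finite, infinite = fermat_curve_counts(n, p)
    return abs(finite + infinite - (p + 1)) <= (n - 1) * (n - 2) * math.sqrt(p)

if __name__ == "__main__":
    for p in (5, 7, 13):
        print("n=2, p=%d:" % p, fermat_curve_counts(2, p))
    for p in (7, 13, 19):
        print("n=3, p=%d:" % p, fermat_curve_counts(3, p), within_bound(3, p))
```

For $n = 2$ the bound is 0, so the two counts must add up to exactly $p + 1$, which is the conclusion of Example 1.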

This result is a special case of the so-called Riemann hypothesis for
finite fields, which states, roughly, that over a finite field with $q$ elements,
the number of points on a projective curve differs from the number of points
on the projective line by an error term that does not exceed twice the genus
(a number associated with the curve) times $\sqrt{q}$.

Special cases of the result were proved by various authors: Gauss,
G. Herglotz, Hasse, and Davenport. The theorem was proved in full generality
by Weil.

3. $f(x) = x_1^2 + x_2^2 + \cdots + x_m^2 - 1$ over $F = \mathbb{Z}/p\mathbb{Z}$, where $m$ is even and
$p \ne 2$.

The number of finite points is given by $p^{m-1} - (-1)^{(m/2)((p-1)/2)} p^{(m/2)-1}$
(see Proposition 8.6.1). Since $\bar f(y) = y_1^2 + y_2^2 + \cdots + y_m^2 - y_0^2$ the number
of points at infinity is equal to the number of solutions to $y_1^2 + \cdots +
y_m^2 = 0$ in $P^{m-1}(F)$. The number of affine solutions is given by $N = p^{m-1} +
(-1)^{(m/2)((p-1)/2)}(p - 1)p^{(m/2)-1}$ (see Exercise 19 in Chapter 8) so the number
of projective solutions is

$$\frac{N - 1}{p - 1} = p^{m-2} + p^{m-3} + \cdots + p + 1 + (-1)^{(m/2)((p-1)/2)} p^{(m/2)-1}.$$

Adding the number of finite solutions to the solutions at infinity yields

$$p^{m-1} + p^{m-2} + \cdots + p + 1.$$

This result is rather remarkable. It says that the number of points on
the projective hypersurface given by $y_1^2 + y_2^2 + \cdots + y_m^2 - y_0^2 = 0$ is
exactly equal to the number of points in $P^{m-1}(\mathbb{Z}/p\mathbb{Z})$.
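
A brute-force check of this coincidence for small $m$ and $p$ is sketched below. It is an illustration only: the function names are hypothetical, $F$ is taken to be $\mathbb{Z}/p\mathbb{Z}$, and points of $P^m(F)$ are represented by tuples whose first nonzero coordinate is 1.

```python
from itertools import product

def projective_quadric_count(m, p):
    """Number of [y0, ..., ym] in P^m(F_p) with y1^2 + ... + ym^2 - y0^2 = 0."""
    count = 0
    for y in product(range(p), repeat=m + 1):
        first_nonzero = next((c for c in y if c != 0), None)
        if first_nonzero != 1:
            continue                    # origin or non-normalized representative
        if (sum(c * c for c in y[1:]) - y[0] * y[0]) % p == 0:
            count += 1
    return count

def projective_space_size(n, p):
    """p^n + p^(n-1) + ... + p + 1."""
    return (p ** (n + 1) - 1) // (p - 1)

if __name__ == "__main__":
    for m, p in [(2, 3), (2, 5), (4, 3), (4, 5)]:
        print(m, p, projective_quadric_count(m, p), projective_space_size(m - 1, p))
```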

There is a simpler way to achieve this result. Instead of considering
the finite and infinite points separately one simply counts the number $M$
of affine solutions to $y_1^2 + y_2^2 + \cdots + y_m^2 - y_0^2 = 0$ in $A^{m+1}(F)$ and then
calculates $(M - 1)/(p - 1)$. Since $m + 1$ is odd, the number $M$ is equal
to $p^m$ (see Exercise 19 in Chapter 8). Thus $(M - 1)/(p - 1) = p^{m-1} +
p^{m-2} + \cdots + p + 1$.
§2 Chevalley’s Theorem


In this section $F$ will denote a finite field with $q$ elements.

If $q$ is a prime, i.e., $F = \mathbb{Z}/q\mathbb{Z}$, the equation $x_1^{q-1} + x_2^{q-1} + \cdots + x_{q-1}^{q-1} = 0$
has no solution except $(0, 0, \ldots, 0)$ because $a^{q-1}$ is equal to 1 or zero
depending on whether $a \ne 0$ or $a = 0$ for $a \in F$. Thus the values taken on by
the above polynomial are $0, 1, 2, \ldots, q - 1$ and it is zero only if $x_1 = x_2 = \cdots
= x_{q-1} = 0$. Notice that for this polynomial the number of variables is
equal to the degree.

In 1935 E. Artin conjectured the following theorem, which was proved
almost immediately by C. Chevalley [16].

Theorem 1. Let $f(x) \in F[x_1, x_2, \ldots, x_n]$ and suppose that

(a) $f(0, 0, \ldots, 0) = 0$.

(b) $n > d = \deg f$.

Then $f$ has at least two zeros in $A^n(F)$.

Before giving the proof we shall deduce an immediate corollary.

Corollary. Let $h(y) \in F[y_0, y_1, \ldots, y_n]$ be a homogeneous polynomial of
degree $d > 0$. If $n + 1 > d$, then $H_h(F)$ is not empty.

Proof. Since $h$ is homogeneous $(0, 0, \ldots, 0)$ is a zero. By Theorem 1 $h$ has
another zero, $(a_0, a_1, \ldots, a_n)$. Clearly $[a_0, a_1, \ldots, a_n] \in H_h(F)$. □

We shall need the following elementary lemmas. 

Lemma 1. Let $f(x_1, x_2, \ldots, x_n)$ be a polynomial that is of degree less than $q$ in
each of its variables. Then if $f$ vanishes on all of $A^n(F)$, it is the zero
polynomial.

Proof. The proof is by induction on $n$. If $n = 1$, $f(x)$ is a polynomial in one
variable of degree less than $q$ with $q$ distinct roots, namely, all the elements
of $F$. Thus $f$ is identically zero.

Suppose that we have proved the result for $n - 1$ and consider
$f(x_1, x_2, \ldots, x_n)$. We can write

$$f(x_1, \ldots, x_n) = \sum_{i=0}^{q-1} g_i(x_1, \ldots, x_{n-1})\, x_n^i.$$

Select $a_1, a_2, \ldots, a_{n-1} \in F$. Then $\sum_{i=0}^{q-1} g_i(a_1, a_2, \ldots, a_{n-1})\, x_n^i$ has $q$ roots
and so $g_i(a_1, a_2, \ldots, a_{n-1}) = 0$. By induction each polynomial $g_i$ is identically
zero and hence so is $f$. □
Remember that $f(x) = x^q - x$ is a nonzero polynomial that vanishes
on all of $A^1(F)$, so the hypothesis of the lemma is crucial.

If a polynomial is of degree less than $q$ in each variable, it is said to be
reduced. Two polynomials $f$, $g$ are said to be equivalent if $f(a) = g(a)$ for
all $a \in A^n(F)$. We write $f \sim g$.

Lemma 2. Each polynomial $f(x) \in F[x_1, \ldots, x_n]$ is equivalent to a reduced
polynomial.

Proof. Consider the case of one variable. Clearly $x^q \sim x$. If $m > 0$ is an
integer, let $l$ be the least positive integer such that $x^m \sim x^l$. We claim that
$l < q$. If not, $l = qs + r$ with $0 \le r < q$ and $s \ne 0$. Then $x^l = (x^q)^s x^r \sim
x^{s+r}$. Since $s + r < l$ this contradicts the minimality of $l$.

In the case of $n$ variables consider the monomial $x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$. By what
has been said, $x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \sim x_1^{j_1} x_2^{j_2} \cdots x_n^{j_n}$, where $j_k < q$ for $k = 1, 2, \ldots, n$.
Lemma 2 follows directly from this remark. □
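
The reduction in Lemma 2 is explicit: since $x^q \sim x$, a positive exponent $i$ may be replaced by the unique $j$ with $1 \le j \le q - 1$ and $j \equiv i$ $(q - 1)$, while the exponent 0 is left alone. The sketch below assumes, for illustration only, that $F = \mathbb{Z}/p\mathbb{Z}$ and that polynomials are dictionaries from exponent tuples to coefficients; the helper names are hypothetical.

```python
from itertools import product

def reduce_poly(f, p):
    """Reduced polynomial equivalent to f over F_p (p prime)."""
    out = {}
    for exps, c in f.items():
        # Exponent 0 stays 0; a positive exponent i becomes ((i-1) mod (p-1)) + 1.
        reduced = tuple(0 if i == 0 else (i - 1) % (p - 1) + 1 for i in exps)
        out[reduced] = (out.get(reduced, 0) + c) % p
    return {e: c for e, c in out.items() if c != 0}

def evaluate(f, a, p):
    """Value of f at the point a, computed modulo p."""
    total = 0
    for exps, c in f.items():
        term = c
        for x, i in zip(a, exps):
            term = term * pow(x, i, p) % p
        total += term
    return total % p

if __name__ == "__main__":
    p = 5
    print(reduce_poly({(5,): 1, (1,): -1}, p))      # x^5 - x reduces to the zero polynomial
    f = {(7, 2): 3, (0, 9): 1, (4, 4): 2}
    g = reduce_poly(f, p)
    same = all(evaluate(f, a, p) == evaluate(g, a, p)
               for a in product(range(p), repeat=2))
    print(g, same)
```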

We are now in a position to prove Theorem 1. Suppose that $(0, 0, \ldots, 0)$
is the only zero of $f$. Then $1 - f^{q-1}$ has the value 1 at $(0, 0, \ldots, 0)$ and the
value zero elsewhere. The same is true of the polynomial $(1 - x_1^{q-1})(1 - x_2^{q-1})
\cdots (1 - x_n^{q-1})$. Thus

$$1 - f^{q-1} - (1 - x_1^{q-1})(1 - x_2^{q-1}) \cdots (1 - x_n^{q-1})$$

vanishes on all of $A^n(F)$. Replace $1 - f^{q-1}$ by an equivalent reduced
polynomial $g$. Then

$$g - (1 - x_1^{q-1})(1 - x_2^{q-1}) \cdots (1 - x_n^{q-1})$$

is of degree less than $q$ in each of its variables and vanishes on all of $A^n(F)$.
By Lemma 1 it vanishes identically. Thus $\deg g = n(q - 1)$. On the other
hand, $\deg g \le \deg(1 - f^{q-1}) = d(q - 1)$. Recall that $d = \deg f$. This
implies that $n \le d$, which is contrary to the hypothesis. Consequently $f$
must have more than one zero.

We shall give another proof due to Ax [3]. It is based on the following 
lemma. 


Lemma 3. Let $i_1, i_2, \ldots, i_n$ be nonnegative integers. Then unless each $i_j$ is
nonzero and divisible by $q - 1$ we have

$$\sum_{a \in A^n(F)} a_1^{i_1} a_2^{i_2} \cdots a_n^{i_n} = 0.$$

Proof. Suppose first that $n = 1$. If $i = 0$, then $\sum_{a \in F} a^0 = q = 0$ in $F$. Suppose
$i \ne 0$. $F^*$ is cyclic. Let $b$ be a generator. If $q - 1 \nmid i$ then

$$\sum_{a \in F} a^i = \sum_{k=0}^{q-2} b^{ki} = \frac{b^{(q-1)i} - 1}{b^i - 1} = 0.$$

In general

$$\sum_{a \in A^n(F)} a_1^{i_1} a_2^{i_2} \cdots a_n^{i_n} = \Bigl(\sum_{a_1 \in F} a_1^{i_1}\Bigr)\Bigl(\sum_{a_2 \in F} a_2^{i_2}\Bigr) \cdots \Bigl(\sum_{a_n \in F} a_n^{i_n}\Bigr).$$

Lemma 3 is now clear. □

It should be remarked that if $q - 1 \mid i_j$ and $i_j \ne 0$ for all $j$, then the value
of the above sum is $(q - 1)^n$.
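
Lemma 3 and the remark following it can be tested directly over a prime field. In the sketch below (hypothetical function names, $F = \mathbb{Z}/p\mathbb{Z}$ assumed) the convention $0^0 = 1$ is used, which is what the case $i = 0$ of the proof requires.

```python
def power_sum(i, p):
    """Sum of a^i over all a in F_p, as a residue mod p (with 0^0 = 1)."""
    return sum(pow(a, i, p) for a in range(p)) % p

def monomial_sum(exps, p):
    """Sum of a_1^{i_1} ... a_n^{i_n} over A^n(F_p), via the product formula."""
    result = 1
    for i in exps:
        result = result * power_sum(i, p) % p
    return result

if __name__ == "__main__":
    p = 7
    print([power_sum(i, p) for i in range(13)])   # nonzero only at i = 6 and i = 12
    print(monomial_sum((6, 12), p), (p - 1) ** 2 % p)
```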

To return to Theorem 1, let $N_f$ be the number of solutions of $f(x) = 0$
in $A^n(F)$. We shall show that $p \mid N_f$, where $p$ is the characteristic of $F$. This
refinement of Chevalley’s theorem was first given by Warning [78].

As we have seen, $1 - f^{q-1}$ has the value 1 at a zero of $f$ and the value
zero otherwise. Thus

$$\bar N_f = \sum_{a \in A^n(F)} \bigl(1 - f(a)^{q-1}\bigr),$$

where $\bar N_f$ is the residue class of $N_f$ mod $p$ considered as an element of $F$.

Let $x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$ be a monomial occurring in $1 - f^{q-1}$. Since this
polynomial has degree at most $d(q - 1)$ we must have $i_j < q - 1$ for some $j$ since
otherwise the degree of the monomial would be at least $n(q - 1)$ and we have
assumed that $d < n$. By Lemma 3, $\sum_{a \in A^n(F)} a_1^{i_1} a_2^{i_2} \cdots a_n^{i_n} = 0$. Since $1 - f^{q-1}$
is a linear combination of monomials it follows that $\bar N_f = 0$, or $p \mid N_f$.

Warning was able to prove that $N_f \ge q^{n-d}$. In a somewhat different
direction Ax showed that $q^b \mid N_f$, where $b$ is the largest integer less than
$n/d$. See [78] and [3] for details.
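
The divisibility $p \mid N_f$ is easy to observe experimentally. The following sketch counts zeros by enumeration over $\mathbb{Z}/p\mathbb{Z}$; the dictionary representation of polynomials and the three sample polynomials are assumptions chosen for illustration, not examples from the text.

```python
from itertools import product

def count_zeros(f, n, p):
    """Number of zeros in A^n(F_p) of f, given as {exponent tuple: coefficient}."""
    def value(a):
        total = 0
        for exps, c in f.items():
            term = c
            for x, i in zip(a, exps):
                term = term * pow(x, i, p) % p
            total += term
        return total % p
    return sum(1 for a in product(range(p), repeat=n) if value(a) == 0)

# Degree 2 in n = 3 variables, no constant term: x1^2 + x2^2 + x3^2.
QUADRIC = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1}
# Degree 3 in n = 4 variables, with a constant term: x1*x2*x3 + x4^2 + 2.
CUBIC = {(1, 1, 1, 0): 1, (0, 0, 0, 2): 1, (0, 0, 0, 0): 2}
# Degree 2 in n = 2 variables: the hypothesis n > d fails.
EDGE = {(2, 0): 1, (0, 2): 1}

if __name__ == "__main__":
    print(count_zeros(QUADRIC, 3, 5))   # a multiple of 5
    print(count_zeros(CUBIC, 4, 5))     # a multiple of 5
    print(count_zeros(EDGE, 2, 3))      # only the origin over F_3
```

The last polynomial shows that the condition $n > d$ cannot simply be dropped: over $\mathbb{Z}/3\mathbb{Z}$, $x_1^2 + x_2^2$ vanishes only at the origin.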


§3 Gauss and Jacobi Sums over Finite Fields 

Let C p = e 2ni/p and F p = Z/pZ. In Chapter 8 the function ij/: F p -^ C given 
by — C P played a crucial role. To carry over the principal results of 
Chapter 8 to an arbitrary finite field F, we need an analog of ij/ for F. This 
is done by means of the trace. 

Suppose that F has q = p n elements. For cleF define tr(a) = a + a p + 
a p2 + • • • + a pn ~ \ tr(a) is called the trace of a. 

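Proposition 10.3.1. If $\alpha, \beta \in F$ and $a \in F_p$, then

(a) $\operatorname{tr}(\alpha) \in F_p$.

(b) $\operatorname{tr}(\alpha + \beta) = \operatorname{tr}(\alpha) + \operatorname{tr}(\beta)$.

(c) $\operatorname{tr}(a\alpha) = a \operatorname{tr}(\alpha)$.

(d) $\operatorname{tr}$ maps $F$ onto $F_p$.

Proof.

(a) We have

$$(\alpha + \alpha^p + \cdots + \alpha^{p^{n-1}})^p = \alpha^p + \alpha^{p^2} + \cdots + \alpha^{p^{n-1}} + \alpha^{p^n}.$$

Since $\alpha^{p^n} = \alpha^q = \alpha$ we see that $\operatorname{tr}(\alpha)^p = \operatorname{tr}(\alpha)$. This proves property (a)
(see Proposition 7.1.1, Corollary 1).

(b) $\operatorname{tr}(\alpha + \beta) = (\alpha + \beta) + (\alpha + \beta)^p + \cdots + (\alpha + \beta)^{p^{n-1}}$

$= (\alpha + \beta) + (\alpha^p + \beta^p) + \cdots + (\alpha^{p^{n-1}} + \beta^{p^{n-1}})$

$= (\alpha + \alpha^p + \cdots + \alpha^{p^{n-1}}) + (\beta + \beta^p + \cdots + \beta^{p^{n-1}})$

$= \operatorname{tr}(\alpha) + \operatorname{tr}(\beta)$.

(c) $\operatorname{tr}(a\alpha) = a\alpha + a^p \alpha^p + \cdots + a^{p^{n-1}} \alpha^{p^{n-1}}$

$= a(\alpha + \alpha^p + \cdots + \alpha^{p^{n-1}})$

$= a \operatorname{tr}(\alpha)$.

We have used the fact that $a^p = a$ for $a \in F_p$.

(d) The polynomial $x + x^p + \cdots + x^{p^{n-1}}$ has at most $p^{n-1}$ roots in $F$.
Since $F$ has $p^n$ elements there is an $\alpha \in F$ such that $\operatorname{tr}(\alpha) = c \ne 0$. If
$b \in F_p$, then using property (c) we see that $\operatorname{tr}((b/c)\alpha) = (b/c) \operatorname{tr}(\alpha) = b$.
Thus the trace is onto. □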
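Properties (a), (b), and (d) can be observed directly on a small field. The sketch below is an illustration under stated assumptions: it models the field with 9 elements as $F_3[x]/(x^2 + 1)$ (the modulus is our choice, irreducible because $-1$ is not a square mod 3), and the names `add`, `mul`, `power`, and `trace` are hypothetical helpers, not notation from the text.

```python
from itertools import product

P = 3          # characteristic p
MOD = (1, 0)   # low coefficients of the monic modulus x^2 + 0*x + 1 (assumed irreducible over F_3)
N = len(MOD)   # degree n, so the field has p^n = 9 elements

def add(a, b):
    return tuple((x + y) % P for x, y in zip(a, b))

def mul(a, b):
    """Multiply two elements, reducing with x^N = -(MOD[0] + MOD[1] x + ...)."""
    prod = [0] * (2 * N - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % P
    for k in range(2 * N - 2, N - 1, -1):
        c, prod[k] = prod[k], 0
        for i in range(N):
            prod[k - N + i] = (prod[k - N + i] - c * MOD[i]) % P
    return tuple(prod[:N])

def power(a, e):
    """Square-and-multiply exponentiation in the field."""
    result = (1,) + (0,) * (N - 1)
    while e:
        if e & 1:
            result = mul(result, a)
        a = mul(a, a)
        e >>= 1
    return result

def trace(a):
    """tr(a) = a + a^p + ... + a^(p^(n-1)), returned as a field element."""
    total = (0,) * N
    for i in range(N):
        total = add(total, power(a, P ** i))
    return total

FIELD = list(product(range(P), repeat=N))
# (a): every trace has no x-component, i.e. it lies in F_p.
in_prime_field = all(trace(a)[1:] == (0,) * (N - 1) for a in FIELD)
# (b): additivity.
additive = all(trace(add(a, b)) == add(trace(a), trace(b)) for a in FIELD for b in FIELD)
# (d): the set of trace values is all of F_p.
image = {trace(a)[0] for a in FIELD}
print(in_prime_field, additive, sorted(image))
```

Changing `P` and `MOD` to another prime and another irreducible polynomial should exercise the same checks on a different field, provided the modulus really is irreducible.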

We now define $\psi: F \to \mathbf{C}$ by the formula $\psi(\alpha) = \zeta_p^{\operatorname{tr}(\alpha)}$. If $F = F_p$, this
coincides with the previous definition.

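Proposition 10.3.2. The function $\psi$ has the following properties:

(a) $\psi(\alpha + \beta) = \psi(\alpha)\psi(\beta)$.

(b) There is an $\alpha \in F$ such that $\psi(\alpha) \ne 1$.

(c) $\sum_{\alpha \in F} \psi(\alpha) = 0$.

Proof.

(a) $\psi(\alpha + \beta) = \zeta_p^{\operatorname{tr}(\alpha + \beta)} = \zeta_p^{\operatorname{tr}(\alpha) + \operatorname{tr}(\beta)} = \zeta_p^{\operatorname{tr}(\alpha)} \zeta_p^{\operatorname{tr}(\beta)} = \psi(\alpha)\psi(\beta)$.

(b) tr is onto, so there is an $\alpha \in F$ such that $\operatorname{tr}(\alpha) = 1$. Then $\psi(\alpha) = \zeta_p \ne 1$.

(c) Let $S = \sum_{\alpha \in F} \psi(\alpha)$. Choose $\beta$ such that $\psi(\beta) \ne 1$. Then
$\psi(\beta)S = \sum_{\alpha \in F} \psi(\beta)\psi(\alpha) = \sum_{\alpha \in F} \psi(\beta + \alpha) = S$. It follows that $S = 0$. □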

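Proposition 10.3.3. Let $\alpha, x, y \in F$. Then

$$\frac{1}{q} \sum_{\alpha \in F} \psi(\alpha(x - y)) = \delta(x, y),$$

where $\delta(x, y) = 1$ if $x = y$ and zero otherwise.

Proof. If $x = y$, then $\sum_{\alpha \in F} \psi(\alpha(x - y)) = \sum_{\alpha \in F} \psi(0) = q$.

If $x \ne y$, then $x - y \ne 0$ and $\alpha(x - y)$ ranges over all of $F$ as $\alpha$ ranges
over all of $F$. Thus $\sum_{\alpha \in F} \psi(\alpha(x - y)) = \sum_{\beta \in F} \psi(\beta) = 0$ by property (c) of
Proposition 10.3.2. □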
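A numerical check of this orthogonality relation is easy on the field with 9 elements. The sketch below is illustrative only: it assumes the model $F_3(i)$ with $i^2 = -1$, and the helper names (`trace`, `psi`, `averaged_sum`) are ours.

```python
import cmath
from itertools import product

P = 3
FIELD = list(product(range(P), repeat=2))   # (a, b) stands for a + b*i with i^2 = -1
ZETA = cmath.exp(2j * cmath.pi / P)

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def sub(u, v):
    return ((u[0] - v[0]) % P, (u[1] - v[1]) % P)

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def trace(u):
    """tr(u) = u + u^p; the result lies in F_p, so its second coordinate is 0."""
    frob = (1, 0)
    for _ in range(P):
        frob = mul(frob, u)
    return add(u, frob)[0]

def psi(u):
    """The additive character psi(u) = zeta_p^tr(u)."""
    return ZETA ** trace(u)

def averaged_sum(x, y):
    """(1/q) * sum over alpha in F of psi(alpha * (x - y))."""
    return sum(psi(mul(a, sub(x, y))) for a in FIELD) / len(FIELD)

# Largest deviation from delta(x, y) over all pairs (x, y).
worst_error = max(
    abs(averaged_sum(x, y) - (1 if x == y else 0)) for x in FIELD for y in FIELD
)
print(worst_error)
```

The printed deviation should be at the level of floating-point rounding, since the sums are evaluated with complex floats rather than exactly.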
Proposition 10.3.3 generalizes the corollary to Lemma 1 of Chapter 6.

In Chapter 7 we proved that the multiplicative group of a finite field is
cyclic. On the basis of this fact, one easily sees that all the definitions and
propositions of Chapter 8, Section 1, can be applied to $F$ as well as to $F_p$.
It is only necessary to replace $p$ by $q$ whenever it occurs. Thus we may
assume that the theory of multiplicative characters for $F$ is known.

We are now in a position to define Gauss sums on F. 

Definition. Let $\chi$ be a character of $F$ and $\alpha \in F^*$. Let $g_\alpha(\chi) = \sum_{t \in F} \chi(t)\psi(\alpha t)$.
$g_\alpha(\chi)$ is called a Gauss sum on $F$ belonging to the character $\chi$.

If we replace $p$ by $q$, Propositions 8.2.1 and 8.2.2 can now be proved for
the sums $g_\alpha(\chi)$. In the proof of Proposition 8.2.2 one needs Proposition 10.3.3.
In particular, we have $|g_\alpha(\chi)| = q^{1/2}$ and $g_\alpha(\chi) g_\alpha(\chi^{-1}) = \chi(-1)q$ for $\chi \ne \varepsilon$.

The general theory of Jacobi sums and the interrelation between Gauss 
sums and Jacobi sums that is developed in Chapter 8 , Section 5, generalizes 
with no difficulty (just replace p by q everywhere), and all the results of 
Chapter 8 , Section 7, also hold. The reader may wish to go back to these 
sections to assure himself that there are indeed no difficulties in generalizing 
the definitions and results. 

As an exercise in working with these new tools, we present a theorem that 
is really a reformulation of some of our earlier work. This theorem will also 
be of use in Chapter 11 . 

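Theorem 2. Suppose that $F$ is a field with $q$ elements and $q \equiv 1 \ (m)$. The homogeneous
equation $a_0 y_0^m + a_1 y_1^m + \cdots + a_n y_n^m = 0$, $a_0, a_1, \ldots, a_n \in F^*$, defines
a hypersurface in $P^n(F)$. The number of points on this hypersurface is given by

$$q^{n-1} + q^{n-2} + \cdots + q + 1 + \frac{1}{q - 1} \sum_{\chi_0, \chi_1, \ldots, \chi_n} \chi_0(a_0^{-1}) \cdots \chi_n(a_n^{-1}) J_0(\chi_0, \chi_1, \ldots, \chi_n), \tag{1}$$

where $\chi_i^m = \varepsilon$, $\chi_i \ne \varepsilon$, and $\chi_0 \chi_1 \cdots \chi_n = \varepsilon$.

Moreover, under these conditions

$$\frac{1}{q - 1} J_0(\chi_0, \chi_1, \ldots, \chi_n) = \frac{1}{q}\, g(\chi_0) g(\chi_1) \cdots g(\chi_n). \tag{2}$$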

Proof. The number of points $N$ on the hypersurface in $A^{n+1}(F)$ defined by
$a_0 y_0^m + a_1 y_1^m + \cdots + a_n y_n^m = 0$ is given by

$$q^n + \sum_{\chi_0, \chi_1, \ldots, \chi_n} \chi_0(a_0^{-1}) \cdots \chi_n(a_n^{-1}) J_0(\chi_0, \chi_1, \ldots, \chi_n),$$

where the characters $\chi_i$ are subject to the conditions stated in Theorem 2.
This follows from Theorem 5 of Chapter 8. The number we are looking for is
$(N - 1)/(q - 1)$ and this yields Equation (1).

By Proposition 8.5.1, part (c), we have

$$J_0(\chi_0, \chi_1, \ldots, \chi_n) = \chi_0(-1)(q - 1) J(\chi_1, \chi_2, \ldots, \chi_n). \tag{3}$$

By Theorem 3 of Chapter 8

$$J(\chi_1, \chi_2, \ldots, \chi_n) = \frac{g(\chi_1) g(\chi_2) \cdots g(\chi_n)}{g(\chi_1 \chi_2 \cdots \chi_n)}. \tag{4}$$

Multiply the numerator and denominator of the right-hand side by $g(\chi_0)$.
Since $\chi_0 \chi_1 \cdots \chi_n = \varepsilon$, we have $g(\chi_0) g(\chi_1 \chi_2 \cdots \chi_n) = \chi_0(-1) q$. Combining this
comment with Equations (3) and (4) yields Equation (2). □


Notes 

There is a pleasant introduction to geometry over finite fields in the book 
Excursions into Mathematics [7]. The authors discuss affine, projective, 
and even hyperbolic geometry. There is also a short but useful bibliography. 

Artin’s conjecture on polynomials over finite fields was made much
earlier by Dickson (On the Representations of Numbers by Modular
Forms, Bull. Am. Math. Soc., 15 (1909), 338-347). The first proof we gave is
the original proof of Chevalley [16]. The second proof is due to J. Ax [3]
and is found in M. Greenberg [37] and Samuel [68]. E. Warning’s proof of a
sharper result can be found in his original paper [78] and in Borevich and
Shafarevich [9].

A. Meyer, in 1884, was able to prove that a quadratic form over the
rationals in five or more variables always has a rational zero if it has a
real zero. Hasse was able to prove that the same result, suitably generalized,
holds over any algebraic number field. E. Artin was led by this and other
considerations to conjecture that over a certain class of number fields a
form of degree $d$ in $n > d^2$ variables always has a nontrivial zero. He also
made conjectures of this nature over other types of fields. For a discussion
of this area of research, see the paper of S. Lang [53], as well as the book of
Greenberg [37], which includes a counterexample to Artin’s conjecture for
p-adic fields, discovered in 1966 by G. Terjanian. Other counterexamples were
provided shortly thereafter by S. Schanuel. There is much left to discover in
this area, which is one of the most fascinating in modern number theory.

If one looks at the case where the ground field is the field of rational functions
over a finite field, then the Artin conjecture mentioned above has been
proved by Carlitz [11]. More precisely, let $F$ be a finite field and $K = F(x)$.
Then every form of degree $d$ in more than $d^2$ variables has a nontrivial zero
in $K$. The proof makes ingenious use of the theorem of Chevalley proved in
this chapter. It is a special case of a general result of S. Lang.

Many of the most important advances in number theory demand an
extensive knowledge of modern algebraic geometry. For a readable and not
too sophisticated introduction to algebraic geometry see W. Fulton [135].
A more extensive introduction is contained in Shafarevich [219]. Finally,
for a reader with more background in commutative algebra, see R. Hartshorne [144].


Exercises 

1. If $K$ is an infinite field and $f(x_1, x_2, \ldots, x_n)$ is a non-zero polynomial with coefficients
in $K$, show that $f$ is not identically zero on $A^n(K)$. (Hint: Imitate the proof of Lemma
1 in Section 2.)

2. In Section 1 it was asserted that $H$, the hyperplane at infinity in $P^n(F)$, has the
structure of $P^{n-1}(F)$. Verify this by constructing a one-to-one, onto map from
$P^{n-1}(F)$ to $H$.

3. Suppose that $F$ has $q$ elements. Use the decomposition of $P^n(F)$ into finite points and
points at infinity to give another proof of the formula for the number of points in
$P^n(F)$.

4. The hypersurface defined by a homogeneous polynomial of degree 1,
$a_0 x_0 + a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$, is called a hyperplane. Show that any hyperplane in
$P^n(F)$ has the same number of elements as $P^{n-1}(F)$.

5. Let $f(x_0, x_1, x_2)$ be a homogeneous polynomial of degree $n$ in $F[x_0, x_1, x_2]$.
Suppose that not every zero of $a_0 x_0 + a_1 x_1 + a_2 x_2$ is a zero of $f$. Prove that
there are at most $n$ common zeros of $f$ and $a_0 x_0 + a_1 x_1 + a_2 x_2$ in $P^2(F)$. In more
geometric language this says that a curve of degree $n$ and a line have at most $n$
points in common unless the line is contained in the curve.


6. Let $F$ be a field with $q$ elements. Let $M_n(F)$ be the set of $n \times n$ matrices with
coefficients in $F$. Let $Sl_n(F)$ be the subset of those matrices with determinant equal to
one. Show that $Sl_n(F)$ can be considered as a hypersurface in $A^{n^2}(F)$. Find a formula
for the number of points on this hypersurface. [Answer: $(q - 1)^{-1}(q^n - 1)(q^n - q) \cdots (q^n - q^{n-1})$.]

7. Let $f \in F[x_0, x_1, \ldots, x_n]$. One can define the partial derivatives $\partial f/\partial x_0$,
$\partial f/\partial x_1, \ldots, \partial f/\partial x_n$ in a formal way. Suppose that $f$ is homogeneous of degree $m$.
Prove that $\sum_{i=0}^{n} x_i (\partial f/\partial x_i) = mf$. This result is due to Euler. (Hint: Do it first for
the case that $f$ is a monomial.)

8. (continuation) If / is homogeneous, a point a on the hypersurface defined by / 
is said to be singular if it is simultaneously a zero of all the partial derivatives of/. 
If the degree of / is prime to the characteristic, show that a common zero of all the 
partial derivatives of / is automatically a zero of /. 


9. If $m$ is prime to the characteristic of $F$, show that the hypersurface defined by
$a_0 x_0^m + a_1 x_1^m + \cdots + a_n x_n^m$ has no singular points.

10. A point on an affine hypersurface is said to be singular if the corresponding point
on the projective closure is singular. Show that this is equivalent to the following
definition. Let $f \in F[x_1, x_2, \ldots, x_n]$, not necessarily homogeneous, and $a \in H_f(F)$.
Then $a$ is singular iff it is a common zero of $\partial f/\partial x_i$ for $i = 1, 2, \ldots, n$.

11. Show that the origin is a singular point on the curve defined by $y^2 - x^3 = 0$.

12. Show that the affine curve defined by $x^2 + y^2 + x^2 y^2 = 0$ has two points at
infinity and that both are singular.


13. Suppose that the characteristic of $F$ is not 2, and consider the curve defined by
$ax^2 + bxy + cy^2 = 1$, where $a, b, c \in F^*$. If $b^2 - 4ac \notin F^2$, show that there are no
points at infinity in $P^2(F)$. If $b^2 - 4ac \in F^2$, show that there are one or two points
at infinity depending on whether $b^2 - 4ac$ is zero. If $b^2 - 4ac = 0$, show that the
point at infinity is singular.

14. Consider the curve defined by $y^2 = x^3 + ax + b$. Show that it has no singular
points (finite or infinite) if $4a^3 + 27b^2 \ne 0$.

15. Let $\mathbf{Q}$ be the field of rational numbers and $p$ a prime. Show that the form
$x_0^{n+1} + p x_1^{n+1} + p^2 x_2^{n+1} + \cdots + p^n x_n^{n+1}$ has no zeros in $P^n(\mathbf{Q})$. (Hint: If $a$ is a zero, one
can assume that the components of $a$ are integers and that they are not all divisible
by $p$.)

16. Show by explicit calculation that every cubic form in two variables over Z/2Z has a 
nontrivial zero. 

17. Show that for each $m > 0$ and finite field $F_q$ there is a form of degree $m$ in $m$ variables
with no nontrivial zero. [Hint: Let $\omega_1, \omega_2, \ldots, \omega_m$ be a basis for $F_{q^m}$ over $F_q$ and
show that $f(x_1, x_2, \ldots, x_m) = \prod_{i=0}^{m-1} (\omega_1^{q^i} x_1 + \cdots + \omega_m^{q^i} x_m)$ has the required
properties.]

18. Let $g_1, g_2, \ldots, g_m \in F_q[x_1, x_2, \ldots, x_n]$ be homogeneous polynomials of degree
$d$ and assume that $n > md$. Prove that there is a nontrivial common zero. [Hint:
Let $f$ be as in Exercise 17 and consider the polynomial
$f(g_1(x_1, \ldots, x_n), \ldots, g_m(x_1, \ldots, x_n))$.]

19. Characterize those extensions $F_{p^n}$ of $F_p$ that are such that the trace is identically
zero on $F_p$.

20. Show that if $\alpha \in F_q$ has trace zero, then $\alpha = \beta - \beta^p$ for some $\beta \in F_q$.

21. Let $\psi$ be a map from $F_q$ to $\mathbf{C}^*$ such that $\psi(\alpha + \beta) = \psi(\alpha)\psi(\beta)$ for all $\alpha, \beta \in F_q$. Show
that there is a $\gamma \in F_q$ such that $\psi(x) = \zeta^{\operatorname{tr}(\gamma x)}$ for all $x \in F_q$, where $\zeta = e^{2\pi i/p}$.

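22. If $g_\alpha(\chi)$ is a Gauss sum on $F$, defined in Section 3, show that

(a) $g_\alpha(\chi) = \chi(\alpha^{-1}) g(\chi)$.

(b) $g(\chi^{-1}) = g(\overline{\chi}) = \chi(-1) \overline{g(\chi)}$.

(c) $|g_\alpha(\chi)| = q^{1/2}$.

(d) $g(\chi) g(\chi^{-1}) = \chi(-1) q$.

23. Suppose that $f$ is a function mapping $F$ to $\mathbf{C}$. Define $\hat{f}(s) = (1/q) \sum_{t} f(t) \overline{\psi(st)}$ and
prove that $f(t) = \sum_{s} \hat{f}(s) \psi(st)$. The last sum is called the finite Fourier series
expansion of $f$.

24. In Exercise 23 take $f$ to be a nontrivial character $\chi$ and show that $\hat{\chi}(s) = (1/q) g_{-s}(\chi)$.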
The zeta function of an algebraic variety has played a
major role in recent developments in diophantine geometry.

In 1924 E. Artin introduced the notion of a zeta function
for a certain class of curves defined over a finite field. By
analogy with the classical Riemann zeta function he conjectured
that the Riemann hypothesis was valid for the
functions he had defined. In special cases he was able to
prove this. Remarkably, results of this nature can already
be found in the work of Gauss (naturally, Gauss stated his
results differently from Artin). Weil was able to prove (in
1948) that the Riemann hypothesis for nonsingular curves
over a finite field was true in general.

In 1949 Weil published a paper in the Bulletin of the
American Mathematical Society entitled “Numbers
of Solutions of Equations over Finite Fields.” In this paper
he defined the zeta function of an algebraic variety and
announced a number of conjectures. Weil had already
proved the validity of his conjectures for curves. Here he
establishes the same results for a class of projective hypersurfaces.
We shall give an exposition of part of this
material. Most of the necessary tools have already been
developed. The main new result that is needed is the
Hasse-Davenport relation between Gauss sums. Weil gave
a proof of this relation that is substantially simpler than
the original. We shall give a proof due to P. Monsky that
is even simpler than Weil’s, although it is far from trivial.

In 1973 Pierre Deligne succeeded in establishing the
validity of the Weil conjectures in all generality. The
proof utilizes the most advanced techniques of modern
algebraic geometry and represents one of the most remarkable
mathematical achievements of this century.


§1 The Zeta Function of a Projective Hypersurface

In Chapter 7, Section 2, we showed that if $F = \mathbf{Z}/p\mathbf{Z}$ and $s \ge 1$ an integer,
then there exists a field $K$ containing $F$ with $p^s$ elements. The same result
holds true in general. Namely, if $F$ is a finite field with $q$ elements and $s \ge 1$
an integer, then there exists a field $F_s$ containing $F$ with $q^s$ elements (this is
$F_{q^s}$ in our former terminology). The proof of the general case is almost
identical with that of the special case (see the Exercises to Chapter 7).

Now, let $f(y) \in F[y_0, y_1, \ldots, y_n]$ be a homogeneous polynomial and
let $N_s$ be the number of points on the projective hypersurface $\overline{H}_f(F_s) \subset P^n(F_s)$.
In less fancy language, $N_s$ is the number of zeros of $f$ in $P^n(F_s)$. We
wish to investigate the way in which the numbers $N_s$ depend on $s$.

At the end of this section we shall prove that the number N s depends 
only on s and not on the field F s . This will follow once we show that any 
two fields containing F and of the same dimension over F are isomorphic. 

To study the numbers $N_s$ we introduce the power series $\sum_{s=1}^{\infty} N_s u^s$.

In all that follows it is possible to deal only with formal power series and thus
to avoid all questions of convergence. To those who are uncomfortable with
that notion, notice that $N_s \le (q^{s(n+1)} - 1)/(q^s - 1) < (n + 1) q^{sn}$. It follows
that our series converges for all complex numbers $u$ such that $|u| < q^{-n}$
and defines an analytic function in that disc.

Let $\exp u = \sum_{s=0}^{\infty} (1/s!)\, u^s$.

Definition. The zeta function of the hypersurface defined by $f$ is the series given
by

$$Z_f(u) = \exp\left(\sum_{s=1}^{\infty} \frac{N_s u^s}{s}\right).$$

It is possible to regard $Z_f(u)$ either as a formal power series or as a function
of a complex variable defined and analytic on the disc $\{u \in \mathbf{C} \mid |u| < q^{-n}\}$.

It may seem strange to deal with $Z_f(u)$ instead of directly considering
the series $\sum N_s u^s$. The reasons are mainly historical, although as we shall
see the zeta function is, in fact, easier to handle. See the remarks at the end of
this section.

As a first example, consider the hyperplane at infinity. By definition
this is the set of points $[a_0, \ldots, a_n] \in P^n(F)$ with $a_0 = 0$. It is defined by the
equation $x_0 = 0$. As we pointed out in Chapter 10 it is easy to see that $\overline{H}_{x_0}(F_s)$
has the same number of points as $P^{n-1}(F_s)$; that is,

$$N_s = q^{s(n-1)} + q^{s(n-2)} + \cdots + q^s + 1.$$

It follows that

$$\sum_{s=1}^{\infty} \frac{N_s u^s}{s} = \sum_{m=0}^{n-1} \sum_{s=1}^{\infty} \frac{(q^m u)^s}{s} = -\sum_{m=0}^{n-1} \ln(1 - q^m u). \tag{1}$$

We have used the identity $\sum_{s=1}^{\infty} w^s/s = -\ln(1 - w)$. Exponentiating
Equation (1) yields

$$Z_{x_0}(u) = (1 - q^{n-1}u)^{-1}(1 - q^{n-2}u)^{-1} \cdots (1 - qu)^{-1}(1 - u)^{-1}.$$

In particular, we see that $Z_{x_0}(u)$ is a rational function of $u$.
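The passage from the numbers $N_s$ to the coefficients of the zeta function can be carried out mechanically. The sketch below is an illustration, not part of the text: writing $Z(u) = \sum z_n u^n$, differentiating $Z = \exp(\sum N_s u^s/s)$ gives the recurrence $n z_n = \sum_{s=1}^{n} N_s z_{n-s}$, which the hypothetical helper `zeta_coefficients` implements. It is tested on the hyperplane $x_0 = 0$ in the projective plane, where $N_s = q^s + 1$ and $Z(u) = 1/((1 - u)(1 - qu))$.

```python
from fractions import Fraction

def zeta_coefficients(counts):
    """Given counts[s-1] = N_s, return the first coefficients of exp(sum N_s u^s / s)."""
    z = [Fraction(1)]
    for n in range(1, len(counts) + 1):
        # n * z_n = sum_{s=1}^{n} N_s * z_{n-s}
        z.append(sum(counts[s - 1] * z[n - s] for s in range(1, n + 1)) / n)
    return z

q = 3
counts = [q**s + 1 for s in range(1, 7)]     # hyperplane x_0 = 0 in the projective plane
coeffs = zeta_coefficients(counts)
# Coefficients of 1/((1 - u)(1 - q u)) are 1 + q + ... + q^n.
expected = [sum(q**k for k in range(n + 1)) for n in range(7)]
print(coeffs == expected)
```

Exact rational arithmetic is used so that integrality of the coefficients can be observed rather than assumed.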
We shall now compute a somewhat more involved example. Consider
the hypersurface defined by $-y_0^2 + y_1^2 + y_2^2 + y_3^2 = 0$. To compute $N_1$
we use Theorem 2 of Chapter 10. Specializing to our case we find that

$$N_1 = q^2 + q + 1 + \frac{1}{q} \chi(-1) g(\chi)^4,$$

where $\chi$ is the character of order 2 on $F$. We know that $g(\chi)^2 = \chi(-1) q$.
Thus

$$N_1 = q^2 + q + 1 + \chi(-1) q.$$

To compute $N_s$ we must replace $q$ by $q^s$ and $\chi$ by $\chi_s$, the character of order 2
on $F_s$. Then

$$N_s = q^{2s} + q^s + 1 + \chi_s(-1) q^s.$$

If $-1$ is a square in $F$, then $\chi_s(-1) = 1$ for all $s$. If $-1$ is not a square
in $F$, it is not hard to see that $\chi_s(-1) = -1$ for $s$ odd and $\chi_s(-1) = 1$ for $s$
even.

In the first case

$$\sum_{s=1}^{\infty} \frac{N_s u^s}{s} = \sum_{s=1}^{\infty} \frac{(q^2 u)^s}{s} + 2 \sum_{s=1}^{\infty} \frac{(qu)^s}{s} + \sum_{s=1}^{\infty} \frac{u^s}{s}$$

and so

$$Z(u) = (1 - q^2 u)^{-1}(1 - qu)^{-2}(1 - u)^{-1}.$$

In the second case the last term gives rise to the sum

$$\sum_{s=1}^{\infty} \frac{(-1)^s (qu)^s}{s} = -\ln(1 + qu).$$

Thus in this case

$$Z(u) = (1 - q^2 u)^{-1}(1 - qu)^{-1}(1 + qu)^{-1}(1 - u)^{-1}.$$

Notice that in the first case the zeta function has a pole at $u = q^{-1}$
of order 2, whereas in the second case there is a pole at $u = q^{-1}$ of order 1.
This is in accordance with a conjecture of John Tate, which relates the order
of the pole at $u = q^{-1}$ to certain geometric properties of the hypersurface.
We cannot go more deeply into this here.
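The value of $N_1$ in this example, $q^2 + q + 1 + \chi(-1)q$, is easy to confirm by exhaustive search when $q = p$ is a small odd prime. The following sketch is illustrative only; the function names are ours, and $\chi(-1)$ is computed with Euler's criterion.

```python
from itertools import product

def projective_points(p):
    """Count projective zeros of -y0^2 + y1^2 + y2^2 + y3^2 over F_p (p an odd prime)."""
    affine = sum(
        1 for y in product(range(p), repeat=4)
        if (-y[0] ** 2 + y[1] ** 2 + y[2] ** 2 + y[3] ** 2) % p == 0
    )
    return (affine - 1) // (p - 1)      # drop the origin, identify scalar multiples

def predicted(p):
    """q^2 + q + 1 + chi(-1) q with q = p, chi the quadratic character."""
    chi_minus_one = 1 if pow(p - 1, (p - 1) // 2, p) == 1 else -1   # Euler's criterion
    return p * p + p + 1 + chi_minus_one * p

for p in (3, 5, 7):
    print(p, projective_points(p), predicted(p))
```

For $p = 5$, where $-1$ is a square, the count is $(p + 1)^2$; for $p = 3$ and $p = 7$ it is $p^2 + 1$, matching the two cases distinguished above.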

As a final example, consider the curve $y_0^3 + y_1^3 + y_2^3 = 0$ over $F = \mathbf{Z}/p\mathbf{Z}$,
where $p$ is a prime congruent to 1 modulo 3.

Specializing Theorem 2 of Chapter 10 once again we find that

$$N_1 = p + 1 + \frac{1}{p} g(\chi)^3 + \frac{1}{p} g(\chi^2)^3.$$

Here $\chi$ is a cubic character on $\mathbf{Z}/p\mathbf{Z}$. We know that $g(\chi)^3 = p\pi$, where
$\pi = J(\chi, \chi)$, and $\pi\overline{\pi} = p$. Thus

$$N_1 = p + 1 + \pi + \overline{\pi}.$$

It will follow from the Hasse-Davenport relation, to be proved later,
that

$$N_s = p^s + 1 - (-\pi)^s - (-\overline{\pi})^s.$$

Calculation now shows that

$$Z_f(u) = \frac{(1 + \pi u)(1 + \overline{\pi} u)}{(1 - u)(1 - pu)}.$$

The numerator can be evaluated explicitly. In Chapter 8 we proved
that $\pi + \overline{\pi} = A$, where $A$ is uniquely determined by $4p = A^2 + 27B^2$
and $A \equiv 1 \ (3)$.

So our final expression is

$$Z_f(u) = \frac{1 + Au + pu^2}{(1 - u)(1 - pu)}.$$

In this example $Z_f(u)$ is a rational function; the numerator and denominator
are polynomials with integer coefficients. The roots of $Z_f(u)$,
$-\pi^{-1}$ and $-\overline{\pi}^{-1}$, both have absolute value $p^{-1/2}$.
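The quantities in this example can be computed for small primes. The sketch below is an illustration under stated assumptions: `projective_count` counts points by brute force, `find_A` searches for the $A$ with $4p = A^2 + 27B^2$ and $A \equiv 1 \ (3)$, and `point_counts` evaluates $N_s = p^s + 1 - (-\pi)^s - (-\overline{\pi})^s$ through the integer recurrence satisfied by the power sums of $-\pi$ and $-\overline{\pi}$ (their sum is $-A$ and their product is $p$). The values for $s \ge 2$ rely on the Hasse-Davenport relation and are not checked here by brute force.

```python
from math import isqrt

def projective_count(p):
    """Number of points on y0^3 + y1^3 + y2^3 = 0 in the projective plane over F_p."""
    affine = sum(
        1 for x in range(p) for y in range(p) for z in range(p)
        if (x**3 + y**3 + z**3) % p == 0
    )
    return (affine - 1) // (p - 1)

def find_A(p):
    """Return A with 4p = A^2 + 27 B^2 and A = 1 mod 3 (p a prime, p = 1 mod 3)."""
    b = 0
    while 27 * b * b <= 4 * p:
        rest = 4 * p - 27 * b * b
        a = isqrt(rest)
        if a * a == rest:
            return a if a % 3 == 1 else -a
        b += 1
    raise ValueError("no representation found; is p a prime congruent to 1 mod 3?")

def point_counts(p, A, s_max):
    """N_s = p^s + 1 - (alpha_1^s + alpha_2^s), where alpha_1, alpha_2 are -pi and its conjugate."""
    power_sums = [2, -A]     # alpha_1 + alpha_2 = -A, alpha_1 * alpha_2 = p
    for s in range(2, s_max + 1):
        power_sums.append(-A * power_sums[-1] - p * power_sums[-2])
    return [p**s + 1 - power_sums[s] for s in range(1, s_max + 1)]

for p in (7, 13):
    A = find_A(p)
    print(p, A, projective_count(p), point_counts(p, A, 3))
```

The first entry of `point_counts` should agree with the brute-force count, which is the content of $N_1 = p + 1 + A$.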

More generally, let $f(x_0, x_1, x_2) \in F[x_0, x_1, x_2]$ be a nonzero homogeneous
polynomial that is nonsingular over every algebraic extension of $F$.
Then, Weil was able to prove that the zeta function of $f$ has the form

$$\frac{P(u)}{(1 - u)(1 - qu)},$$

where $P(u)$ is a polynomial with integer coefficients of degree $(d - 1)(d - 2)$,
$d$ being the degree of $f$. Furthermore, if $\alpha$ is a root of $P(u)$, then $|\alpha| = q^{-1/2}$.
The last statement is called the Riemann hypothesis for curves.

[To see the relation with the classical Riemann hypothesis, make the
change of variables $u = q^{-s}$ and set $\zeta_f(s) = Z_f(q^{-s})$. $\zeta_f(s)$ is directly analogous
to the classical zeta function (see the end of this section). The condition that
the roots of $Z_f(u)$ have absolute value $q^{-1/2}$ is equivalent to the condition that
the roots of $\zeta_f(s)$ have real part $\tfrac{1}{2}$.]

In all our examples the zeta function is rational. In 1959 B. Dwork 
proved that any algebraic set has a rational zeta function [26]. His proof is 
extremely beautiful, but unfortunately it is based on methods that are 
beyond the scope of this book. 

Our examples suggest another characterization of the condition that the 
zeta function is rational. 

It is immediate from the definition of the zeta function that if it is expanded
in a power series about the origin, then the constant term is 1.
Consequently, if $Z_f(u) = P(u)/Q(u)$, where $P(u)$ and $Q(u)$ are polynomials,
we may assume that $P(0) = Q(0) = 1$ (prove it). With this assumption, the
zeta function can be factored as follows:

$$Z_f(u) = \frac{\prod_i (1 - \alpha_i u)}{\prod_j (1 - \beta_j u)},$$

where $\alpha_i, \beta_j \in \mathbf{C}$. We can now prove

Proposition 11.1.1. The zeta function is rational iff there exist complex numbers
$\alpha_i$ and $\beta_j$ such that

$$N_s = \sum_j \beta_j^s - \sum_i \alpha_i^s.$$

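Proof. Suppose that the zeta function is rational. Then by the above remarks

$$Z(u) = \frac{\prod_i (1 - \alpha_i u)}{\prod_j (1 - \beta_j u)}$$

with $\alpha_i, \beta_j \in \mathbf{C}$. Taking the logarithmic derivative of both sides:

$$\frac{Z'(u)}{Z(u)} = \sum_i \frac{-\alpha_i}{1 - \alpha_i u} - \sum_j \frac{-\beta_j}{1 - \beta_j u}.$$

Multiply both sides by $u$ and then use the geometric series to expand
in a power series. One finds finally that

$$\frac{uZ'(u)}{Z(u)} = \sum_{s=1}^{\infty} \Bigl(\sum_j \beta_j^s - \sum_i \alpha_i^s\Bigr) u^s. \tag{2}$$

We now compute the left-hand side in a different way. From the definition

$$Z(u) = \exp\left(\sum_{s=1}^{\infty} \frac{N_s u^s}{s}\right).$$

Differentiate logarithmically both sides and then multiply both sides by $u$.
We find that

$$\frac{uZ'(u)}{Z(u)} = \sum_{s=1}^{\infty} N_s u^s. \tag{3}$$

Comparing coefficients of $u^s$ in Equations (2) and (3) we have

$$N_s = \sum_j \beta_j^s - \sum_i \alpha_i^s.$$

The converse is an easy calculation that we have done in special cases.
We leave the details to the reader. □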


It remains to prove that the number N s is independent of the choice of 
field F s . The reader may wish to simply accept this fact and proceed to 
Section 2. 

Suppose that $E$ and $E'$ are two fields containing $F$ both with $q^s$ elements.

Proposition 11.1.2. $E$ and $E'$ are isomorphic over $F$; i.e., there exists a map
$\sigma: E \to E'$ such that

(a) $\sigma$ is one to one and onto.

(b) $\sigma(a) = a$ for all $a \in F$.

(c) $\sigma(\alpha + \beta) = \sigma(\alpha) + \sigma(\beta)$ for all $\alpha, \beta \in E$.

(d) $\sigma(\alpha\beta) = \sigma(\alpha)\sigma(\beta)$ for all $\alpha, \beta \in E$.

Proof. We shall show that both $E$ and $E'$ are isomorphic over $F$ to $F[x]/(f(x))$
for some irreducible polynomial $f(x) \in F[x]$.

To begin with there is an $\alpha' \in E'$ such that $E' = F(\alpha')$ (for example, take
$\alpha'$ to be a primitive $q^s - 1$ root of unity). Let $f(x) \in F[x]$ be the monic
irreducible polynomial for $\alpha'$. Then $E' \cong F[x]/(f(x))$. Since $\alpha'$ satisfies
$x^{q^s} - x = 0$ we have $f(x) \mid x^{q^s} - x$.

Since $E$ has $q^s$ elements we have $x^{q^s} - x = \prod_{\alpha \in E} (x - \alpha)$. It follows that
$f(\alpha) = 0$ for some $\alpha \in E$.

Thus $F(\alpha) \cong F[x]/(f(x))$ is a subfield of $E$ with $q^s$ elements. One concludes
that $E = F(\alpha) \cong F[x]/(f(x)) \cong F(\alpha') = E'$. □

We can now use the isomorphism $\sigma$ to induce a map $\tilde{\sigma}$ from $P^n(E)$ to
$P^n(E')$. Namely,

$$\tilde{\sigma}([\alpha_0, \ldots, \alpha_n]) = [\sigma(\alpha_0), \ldots, \sigma(\alpha_n)].$$

$\tilde{\sigma}$ is one to one and onto. Moreover, if $f(y_0, y_1, \ldots, y_n) \in F[y_0, y_1, \ldots, y_n]$
and we restrict $\tilde{\sigma}$ to the projective hypersurface $\overline{H}_f(E)$, it maps onto the
projective hypersurface $\overline{H}_f(E')$. This proves the independence of the numbers
$N_s$ from the choice of field $F_s$. We leave the details to the reader.

We conclude this section with a discussion of the analogy between the 
congruence zeta function and the Riemann zeta function. 

The Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$ may be written, by the
fundamental theorem of arithmetic, as an infinite product $\prod_p (1 - p^{-s})^{-1}$,
the product being over all prime numbers $p$ (see Exercise 25, Chapter 2).
We will establish an analogous infinite product for $Z_f(u)$, the product being
over certain objects called the prime divisors of the underlying algebraic
set. This will be done with a minimum of technical language from algebraic
geometry.

If $F$ is a finite field with $q$ elements consider any algebraic set $V$ in $A^n(F)$.
Then we may define as in Section 1 the zeta function of $V$ over $F$ as

$$Z_V(u) = \exp\left(\sum_{s=1}^{\infty} \frac{N_s u^s}{s}\right),$$

where $N_s$ is the number of points in $A^n(F_{q^s})$ satisfying the equations defining
$V$. We consider an affine algebraic set rather than a projective algebraic set to
simplify the discussion. Furthermore it is convenient to have a single field
$K \supset F$ which is algebraic over $F$ and contains an extension of degree $s$
over $F$ for every integer $s \ge 1$. It follows easily from Proposition 7.1.1 that $K$
then contains precisely one field with $q^s$ elements. A simple construction for
such a field $K$ is given in the Exercises. This field is uniquely determined up to
isomorphism and is called an algebraic closure of $F$. We may then consider
$A^n(K)$ and extend $V$ to be an algebraic set, still denoted by $V$, in $A^n(K)$ with
$N_s$ points whose coordinates are in $F_{q^s}$.

If $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n) \in V$ let $F_{q^d}$ be the smallest field containing $F$ and
$\alpha_1, \alpha_2, \ldots, \alpha_n$. We say that $\alpha$ is a point of degree $d$. Since $a^q = a$ for $a \in F$ it
follows that the points $\alpha, \alpha^q, \alpha^{q^2}, \ldots, \alpha^{q^{d-1}}$ are also in $V$, where the exponent
denotes raising each coordinate to the indicated power. Furthermore these
points are distinct (by, say, the corollary to Proposition 7.1.1).

Definition. A prime divisor on $V$ is a set of the form $\{\alpha^{q^j} \mid j = 0, 1, 2, \ldots, d - 1\}$
where $\alpha$ is a point on $V$ of degree $d$. This is somewhat at variance with
common usage. What we call a prime divisor is usually referred to as a prime
zero cycle defined over $F$.

Prime divisors are traditionally denoted by $\mathfrak{P}$. The degree of $\mathfrak{P}$, $\deg \mathfrak{P}$,
is $d$.

The prime divisors clearly partition $V \subset A^n(K)$. Furthermore if $\alpha$ is a
point on $V$ with coordinates in $F_{q^s}$ then $\alpha$ defines a unique prime divisor $\mathfrak{P}$
of degree $d$ for some $d \mid s$ by Proposition 7.1.5. This prime divisor contains $d$
points on $V$ each with coordinates in $F_{q^s}$. If we define $a_d$ to be the number of
prime divisors on $V$ of degree $d$ (a number which is finite) then we have by the
above the following important result.

Lemma 1. $N_s = \sum_{d \mid s} d\, a_d$.

The main result of this section may now be stated.

Proposition 11.1.3. $Z_V(u) = \prod_{\mathfrak{P}} \bigl(1/(1 - u^{\deg \mathfrak{P}})\bigr)$.

Proof. The right-hand side is clearly

$$\prod_{n=1}^{\infty} \left(\frac{1}{1 - u^n}\right)^{a_n}.$$

The logarithmic derivative of this expression is

$$\frac{1}{u} \sum_{n=1}^{\infty} \frac{n a_n u^n}{1 - u^n}.$$

Expanding the denominator into a geometric series and computing the
coefficient of $u^m$ we obtain

$$\frac{1}{u} \sum_{m=1}^{\infty} \Bigl(\sum_{d \mid m} d\, a_d\Bigr) u^m,$$

which by Lemma 1 becomes

$$\sum_{m=1}^{\infty} N_m u^{m-1}.$$

Integrating and taking the exponential gives the result. □
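Lemma 1 and Proposition 11.1.3 can be tried out on the simplest algebraic set, the affine line, where $N_s = q^s$ and $Z_V(u) = 1/(1 - qu)$. The sketch below is illustrative and uses hypothetical helper names: `divisor_counts` solves $N_s = \sum_{d \mid s} d\, a_d$ for the $a_d$, and `euler_product` expands the truncated product $\prod_d (1 - u^d)^{-a_d}$. For the affine line the $a_d$ also count the monic irreducible polynomials of degree $d$ over $F_q$, since a prime divisor is the set of roots of such a polynomial.

```python
def divisor_counts(counts):
    """Solve N_s = sum_{d | s} d * a_d for a_1, a_2, ...; counts[s-1] = N_s."""
    a = []
    for s in range(1, len(counts) + 1):
        known = sum(d * a[d - 1] for d in range(1, s) if s % d == 0)
        a.append((counts[s - 1] - known) // s)
    return a

def euler_product(a, degree):
    """Coefficients of prod_d (1 - u^d)^(-a_d) up to u^degree."""
    coeffs = [1] + [0] * degree
    for d, a_d in enumerate(a, start=1):
        for _ in range(a_d):
            # Multiply the truncated series by 1/(1 - u^d).
            for n in range(d, degree + 1):
                coeffs[n] += coeffs[n - d]
    return coeffs

q, top = 2, 6
counts = [q**s for s in range(1, top + 1)]     # V = affine line: N_s = q^s
a = divisor_counts(counts)
print(a)                        # numbers of prime divisors of degree 1, ..., 6
print(euler_product(a, top))    # should reproduce the coefficients q^n of 1/(1 - q u)
```

Prime divisors of degree larger than `top` only affect coefficients beyond $u^{\text{top}}$, so truncating the product is harmless for this comparison.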

This result shows that $Z(u)$ has integral coefficients. The analogy with the
Riemann zeta function becomes even more striking if one introduces a new
variable $s$ related to $u$ by $u = q^{-s}$. Then we have

$$Z_V(q^{-s}) = \prod_{\mathfrak{P}} \left(1 - \frac{1}{q^{s \deg \mathfrak{P}}}\right)^{-1} = \prod_{\mathfrak{P}} \left(1 - \frac{1}{(q^{\deg \mathfrak{P}})^s}\right)^{-1},$$

in perfect analogy with the Riemann zeta function.

§2 Trace and Norm in Finite Fields 

In Chapter 10, Section 3, we introduced the notion of trace. Here we shall
generalize that notion and also define the norm in finite fields.

Let $F$ be a finite field with $q$ elements and $E$ a field containing $F$ with $q^s$
elements.

Definition. If $\alpha \in E$, the trace of $\alpha$ from $E$ to $F$ is given by

$$\operatorname{tr}_{E/F}(\alpha) = \alpha + \alpha^q + \cdots + \alpha^{q^{s-1}}.$$

The norm of $\alpha$ from $E$ to $F$ is given by

$$N_{E/F}(\alpha) = \alpha \cdot \alpha^q \cdots \alpha^{q^{s-1}}.$$

The following two propositions describe the basic properties of trace and
norm.

Proposition 11.2.1. If $\alpha, \beta \in E$ and $a \in F$, then

(a) $\operatorname{tr}_{E/F}(\alpha) \in F$.

(b) $\operatorname{tr}_{E/F}(\alpha + \beta) = \operatorname{tr}_{E/F}(\alpha) + \operatorname{tr}_{E/F}(\beta)$.

(c) $\operatorname{tr}_{E/F}(a\alpha) = a \operatorname{tr}_{E/F}(\alpha)$.

(d) $\operatorname{tr}_{E/F}$ maps $E$ onto $F$.

Proposition 11.2.2. If $\alpha, \beta \in E$ and $a \in F$, then

(a) $N_{E/F}(\alpha) \in F$.

(b) $N_{E/F}(\alpha\beta) = N_{E/F}(\alpha) N_{E/F}(\beta)$.

(c) $N_{E/F}(a\alpha) = a^s N_{E/F}(\alpha)$.

(d) $N_{E/F}$ maps $E^*$ onto $F^*$.

Proof. The proof of Proposition 11.2.1 is exactly analogous to that of
Proposition 10.3.1 and will be omitted.

To prove Proposition 11.2.2 notice that

$$N_{E/F}(\alpha)^q = (\alpha \cdot \alpha^q \cdots \alpha^{q^{s-1}})^q = \alpha^q \cdot \alpha^{q^2} \cdots \alpha^{q^s} = N_{E/F}(\alpha).$$

Thus $N_{E/F}(\alpha) \in F$.

Now,

$$N_{E/F}(\alpha\beta) = (\alpha\beta) \cdot (\alpha\beta)^q \cdots (\alpha\beta)^{q^{s-1}} = (\alpha \cdot \alpha^q \cdots \alpha^{q^{s-1}})(\beta \cdot \beta^q \cdots \beta^{q^{s-1}}) = N_{E/F}(\alpha) N_{E/F}(\beta).$$

This proves step (b).

To prove step (c) notice that for $a \in F$, $N_{E/F}(a) = a \cdot a^q \cdots a^{q^{s-1}} = a^s$
since $a^q = a$. Now apply the result of step (b).

Finally, consider the kernel of the homomorphism $N_{E/F}$, i.e., the set of all
$\alpha \in E^*$ such that $N_{E/F}(\alpha) = 1$. $\alpha$ is in the kernel iff

$$1 = \alpha \cdot \alpha^q \cdots \alpha^{q^{s-1}} = \alpha^{1 + q + \cdots + q^{s-1}} = \alpha^{(q^s - 1)/(q - 1)}.$$

Since $(q^s - 1)/(q - 1) \mid q^s - 1$ we have by Proposition 7.1.2 that $x^{(q^s - 1)/(q - 1)} = 1$
has $(q^s - 1)/(q - 1)$ solutions in $E$. By elementary group theory it follows
that the image, $N_{E/F}(E^*)$, has $q - 1$ elements, but this is exactly the number of
elements in $F^*$. Thus $N_{E/F}$ is onto. □
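The statements of Proposition 11.2.2 and the kernel count in the proof can be observed on a quadratic extension. The sketch below is an illustration under stated assumptions: it takes $F = F_5$ and models $E$ as $F(r)$ with $r^2 = 2$ (2 is not a square mod 5, so $E$ has 25 elements); the helper names are ours.

```python
from itertools import product

P = 5          # F = F_5, so q = 5
NONRES = 2     # assumption: 2 is a non-square mod 5, so E = F(r) with r^2 = 2 has 25 elements

def mul(u, v):
    """(a + b r)(c + d r) with r^2 = NONRES."""
    return ((u[0] * v[0] + NONRES * u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def power(u, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result

def norm(u):
    """N_{E/F}(u) = u * u^q for [E : F] = 2."""
    return mul(u, power(u, P))

UNITS = [u for u in product(range(P), repeat=2) if u != (0, 0)]
# (a): the norm has no r-component.
lands_in_F = all(norm(u)[1] == 0 for u in UNITS)
# (b): multiplicativity.
multiplicative = all(norm(mul(u, v)) == mul(norm(u), norm(v)) for u in UNITS for v in UNITS)
# (d): the image is all of F*.
image = {norm(u)[0] for u in UNITS}
# Kernel size should be (q^s - 1)/(q - 1) = 24/4 = 6.
kernel_size = sum(1 for u in UNITS if norm(u) == (1, 0))
print(lands_in_F, multiplicative, sorted(image), kernel_size)
```

The kernel has $(q^s - 1)/(q - 1)$ elements and the image has $q - 1$, in line with the group-theoretic count used in the proof.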

Given a tower of fields $F \subset E \subset K$ we have the relation $[K : F] = [K : E][E : F]$.
This result is easy to prove in general. If all three fields are
finite, we can prove it as follows. Let $q$ be the number of elements in $F$.
Then the number of elements in $E$ and $K$ are $q^{[E:F]}$ and $q^{[K:F]}$, respectively.
Considering $K$ as an extension of $E$ we can express the number of elements in
$K$ as $(q^{[E:F]})^{[K:E]}$. Thus

$$q^{[K:F]} = q^{[E:F][K:E]}$$

and therefore $[K : F] = [E : F][K : E]$.

We can now prove another simple property of trace and norm that will 
be useful. 

Proposition 11.2.3. Let $F \subset E \subset K$ be three finite fields and $\alpha \in K$. Then

(a) $\operatorname{tr}_{K/F}(\alpha) = \operatorname{tr}_{E/F}(\operatorname{tr}_{K/E}(\alpha))$.

(b) $N_{K/F}(\alpha) = N_{E/F}(N_{K/E}(\alpha))$.

Proof. We shall prove only property (a). The proof of property (b) is similar.

Let $d = [E : F]$, $m = [K : E]$, and $n = [K : F]$. As we have pointed out
above, $n = dm$.

The number of elements in $E$ is $q_1 = q^d$. Thus

$$\operatorname{tr}_{K/E}(\alpha) = \alpha + \alpha^{q_1} + \cdots + \alpha^{q_1^{m-1}}$$

and

$$\operatorname{tr}_{E/F}(\operatorname{tr}_{K/E}(\alpha)) = \sum_{i=0}^{d-1} \operatorname{tr}_{K/E}(\alpha)^{q^i} = \sum_{i=0}^{d-1} \sum_{j=0}^{m-1} \bigl(\alpha^{q_1^j}\bigr)^{q^i} = \sum_{i=0}^{d-1} \sum_{j=0}^{m-1} \alpha^{q^{dj+i}} = \sum_{k=0}^{n-1} \alpha^{q^k} = \operatorname{tr}_{K/F}(\alpha).$$

We have used the fact that as $j$ varies from zero to $m - 1$ and $i$ varies
from zero to $d - 1$ the quantity $dj + i$ varies from zero to $md - 1 = n - 1$. □

Suppose now that $F \subset K$ are finite fields, $n = [K : F]$, and $\alpha \in K$. Let
$E = F(\alpha)$ and $f(x) \in F[x]$ be the minimal polynomial for $\alpha$ over $F$. By
Proposition 7.2.2 we have $[E : F] = d$, where $d$ is the degree of $f(x)$.

Proposition 11.2.4. Write $f(x) = x^d - c_1 x^{d-1} + \cdots + (-1)^d c_d$. Then

(a) $f(x) = (x - \alpha)(x - \alpha^q) \cdots (x - \alpha^{q^{d-1}})$.

(b) $\operatorname{tr}_{K/F}(\alpha) = (n/d) c_1$.

(c) $N_{K/F}(\alpha) = c_d^{n/d}$.

Proof. Since the coefficients of $f$ satisfy $a^q = a$ we have

$$0 = f(\alpha)^q = f(\alpha^q).$$

Thus $\alpha^q$ is a root of $f$. Similarly,

$$0 = f(\alpha^q)^q = f(\alpha^{q^2}).$$

Thus $\alpha^{q^2}$ is a root of $f$. Continuing in this manner we see that $\alpha, \alpha^q, \alpha^{q^2}, \ldots,
\alpha^{q^{d-1}}$ are all roots of $f$. If we can show that all these roots are distinct, assertion
(a) will follow.

Suppose that $0 \le i \le j < d$ and that $\alpha^{q^i} = \alpha^{q^j}$. Set $k = j - i$. We shall
show that $k = 0$.

We have

$$\alpha^{q^i} = \alpha^{q^j} = \bigl(\alpha^{q^k}\bigr)^{q^i},$$

which implies that

$$\bigl(\alpha - \alpha^{q^k}\bigr)^{q^i} = 0$$

and so

$$\alpha = \alpha^{q^k}.$$

Since $f(x)$ is the minimal polynomial for $\alpha$ it follows that $f(x)$ divides
$x^{q^k} - x$ and so by Theorem 2 of Chapter 7 we have $d \mid k$. However, $0 \le k < d$
and so $k = 0$ and we are done.

It follows immediately from assertion (a) that $c_1 = \operatorname{tr}_{E/F}(\alpha)$ and that
$c_d = N_{E/F}(\alpha)$.

Since $\alpha \in E = F(\alpha)$ we have $\operatorname{tr}_{K/E}(\alpha) = [K : E]\alpha = (n/d)\alpha$ and
$N_{K/E}(\alpha) = \alpha^{n/d}$.

By Proposition 11.2.3,

$$\operatorname{tr}_{K/F}(\alpha) = \operatorname{tr}_{E/F}\Bigl(\frac{n}{d}\alpha\Bigr) = \frac{n}{d} \operatorname{tr}_{E/F}(\alpha) = \frac{n}{d} c_1.$$

Similarly,

$$N_{K/F}(\alpha) = N_{E/F}(N_{K/E}(\alpha)) = N_{E/F}(\alpha^{n/d}) = N_{E/F}(\alpha)^{n/d} = c_d^{n/d}.$$ □


§3 The Rationality of the Zeta Function Associated to $a_0 x_0^m + a_1 x_1^m + \cdots + a_n x_n^m$

Let $f(x_0, x_1, \ldots, x_n)$ be the polynomial given in the title of this section
[notice that this is not the $f(x)$ of Section 2]. Suppose that the coefficients
are in $F$, a finite field with $q$ elements, and that $q \equiv 1 \ (m)$. We have to investigate
the number $N_s$ of elements in $\overline{H}_f(F_s)$, where $[F_s : F] = s$. Theorem 2
of Chapter 10 shows that $N_s$ is given by

$$q^{s(n-1)} + q^{s(n-2)} + \cdots + q^s + 1 + \frac{1}{q^s} \sum_{\chi_0^{(s)}, \ldots, \chi_n^{(s)}} \chi_0^{(s)}(a_0^{-1}) \cdots \chi_n^{(s)}(a_n^{-1})\, g(\chi_0^{(s)}) \cdots g(\chi_n^{(s)}), \tag{4}$$

where $q^s$ is the number of elements in $F_s$, and the $\chi_i^{(s)}$ are multiplicative
characters of $F_s$ such that $(\chi_i^{(s)})^m = \varepsilon$, $\chi_i^{(s)} \ne \varepsilon$, and $\chi_0^{(s)} \chi_1^{(s)} \cdots \chi_n^{(s)} = \varepsilon$.

We must analyze the terms $\chi_i^{(s)}(a_i^{-1})$ and $g(\chi_i^{(s)})$. To do this we first relate
characters of $F_s$ to characters of $F$.

Let $\chi$ be a character of $F$ and set $\chi' = \chi \circ N_{F_s/F}$; i.e., for $\alpha \in F_s$,
$\chi'(\alpha) = \chi(N_{F_s/F}(\alpha))$. Then one sees, using Proposition 11.2.2, that $\chi'$ is a
character of $F_s$, and moreover that

(a) $\chi \neq \rho$ implies that $\chi' \neq \rho'$.

(b) $\chi^m = \varepsilon$ implies that $\chi'^m = \varepsilon$.

(c) $\chi'(a) = \chi(a)^s$ for all $a \in F$.

It follows easily that as $\chi$ varies over the characters of $F$ of order dividing
$m$, $\chi'$ varies over the characters of $F_s$ of order dividing $m$.

The sum in Equation (4) can now be rewritten as

$$\sum_{\chi_0, \ldots, \chi_n} \chi_0(a_0^{-1})^s \cdots \chi_n(a_n^{-1})^s\, g(\chi_0') \cdots g(\chi_n'), \tag{5}$$

where $\chi_0, \ldots, \chi_n$ are characters of $F$ satisfying $\chi_i^m = \varepsilon$, $\chi_i \neq \varepsilon$, and
$\chi_0 \chi_1 \cdots \chi_n = \varepsilon$.

It remains to analyze the Gauss sums $g(\chi')$. This is the content of the
following theorem of Hasse and Davenport (see [23]).

Theorem 1. $(-g(\chi))^s = -g(\chi')$.

We postpone the proof of this relation. Using Theorem 1 and Equations
(4) and (5) we see that $N_s$ is given by

$$\sum_{k=0}^{n-1} q^{ks} + (-1)^{n+1} \sum_{\chi_0, \chi_1, \ldots, \chi_n} \left[ \frac{(-1)^{n+1}}{q}\, \chi_0(a_0^{-1}) \cdots \chi_n(a_n^{-1})\, g(\chi_0) \cdots g(\chi_n) \right]^s, \tag{6}$$

where the second sum is restricted by the same conditions as Equation (5).
Applying Proposition 11.1.1 gives us the main result of this chapter.

Theorem 2. Let $a_0, a_1, \ldots, a_n \in F^*$, where $F$ is a finite field with $q$ elements, and
$q \equiv 1 \ (m)$. Let $f(x_0, \ldots, x_n) = a_0x_0^m + a_1x_1^m + \cdots + a_nx_n^m$. Then the zeta
function $Z_f(u)$ is a rational function of the form

$$\frac{P(u)^{(-1)^n}}{(1 - u)(1 - qu) \cdots (1 - q^{n-1}u)},$$

where $P(u)$ is the polynomial

$$\prod_{\chi_0, \chi_1, \ldots, \chi_n} \left( 1 - (-1)^{n+1}\, \frac{1}{q}\, \chi_0(a_0^{-1}) \cdots \chi_n(a_n^{-1})\, g(\chi_0) g(\chi_1) \cdots g(\chi_n)\, u \right),$$

the $(n + 1)$-tuples $\chi_0, \chi_1, \ldots, \chi_n$ being subject to the conditions $\chi_i^m = \varepsilon$, $\chi_i \neq \varepsilon$,
and $\chi_0 \chi_1 \cdots \chi_n = \varepsilon$.

A number of remarks are in order: 

(1) The degree of $P(u)$ can be computed explicitly. It is

$$m^{-1}\left[(m - 1)^{n+1} + (-1)^{n+1}(m - 1)\right].$$

(2) Since $|g(\chi)| = q^{1/2}$ it follows from the explicit expression for $P(u)$ that
the zeros of $Z_f(u)$ have absolute value $q^{-((n-1)/2)}$. This is in accord with
the general Riemann hypothesis.

(3) If we write $P(u) = \prod (1 - \alpha u)$, then the numbers $\alpha$ are algebraic integers.
This is not hard to see. Each $\alpha$ has the form

$$\zeta \cdot \frac{g(\chi_0) \cdots g(\chi_n)}{q},$$

where $\zeta$ is a root of unity and $\chi_0 \chi_1 \cdots \chi_n = \varepsilon$. Using Corollary 1 to
Theorem 3 of Chapter 8 we see that

$$\frac{1}{q}\, g(\chi_0) g(\chi_1) \cdots g(\chi_n) = \chi_n(-1) J(\chi_0, \chi_1, \ldots, \chi_{n-1}).$$

The Jacobi sum is a sum of roots of unity and so is an algebraic integer.
Thus $\alpha = \zeta \chi_n(-1) J(\chi_0, \chi_1, \ldots, \chi_{n-1})$ is an algebraic integer as well.
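
As a numerical sanity check on Equation (4) with $s = 1$ and on the degree formula in remark (1), the following Python sketch evaluates the Gauss-sum expression over a prime field $F_p$ and compares it with a direct count of projective points. It is only an illustration under stated assumptions: $q = p$ is prime with $p \equiv 1 \ (m)$, all coefficients are nonzero mod $p$, and the function names (`primitive_root`, `gauss_sum_count`, `brute_force_count`) as well as the floating-point evaluation of characters are choices made here, not part of the text.

```python
import cmath
from itertools import product


def primitive_root(p):
    """Smallest generator of the multiplicative group of F_p (p an odd prime)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def gauss_sum_count(p, m, coeffs):
    """Evaluate Equation (4) with s = 1 for a_0 x_0^m + ... + a_n x_n^m over F_p.

    Returns (N_1 as a complex float, number of character tuples in the sum).
    Assumes p is prime, p = 1 (mod m), and every coefficient is nonzero mod p.
    """
    n = len(coeffs) - 1
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}  # discrete logarithms base g
    zeta = cmath.exp(2j * cmath.pi / p)

    def chi(j, t):
        # chi_j(g^k) = exp(2 pi i j k / m): the characters of order dividing m
        return cmath.exp(2j * cmath.pi * j * log[t % p] / m)

    # Gauss sums g(chi_j) for the nontrivial characters chi_1, ..., chi_{m-1}
    gauss = {j: sum(chi(j, t) * zeta ** t for t in range(1, p)) for j in range(1, m)}
    inverses = [pow(a, p - 2, p) for a in coeffs]  # a^(-1) via Fermat's little theorem

    total, tuples = 0, 0
    for js in product(range(1, m), repeat=n + 1):  # nontrivial chi_0, ..., chi_n
        if sum(js) % m != 0:                       # require chi_0 ... chi_n = epsilon
            continue
        tuples += 1
        term = 1
        for j, a_inv in zip(js, inverses):
            term *= chi(j, a_inv) * gauss[j]
        total += term
    return sum(p ** k for k in range(n)) + total / p, tuples


def brute_force_count(p, m, coeffs):
    """Count projective zeros of the diagonal form by direct search."""
    affine = sum(
        1
        for xs in product(range(p), repeat=len(coeffs))
        if sum(a * pow(x, m, p) for a, x in zip(coeffs, xs)) % p == 0
    )
    return (affine - 1) // (p - 1)  # drop the origin, identify scalar multiples
```

For $x_0^3 + x_1^3 + x_2^3$ over $F_7$, `gauss_sum_count(7, 3, [1, 1, 1])` uses 2 character tuples, matching $3^{-1}[2^3 - 2] = 2$, and its value agrees up to rounding with the 9 projective points found by `brute_force_count(7, 3, [1, 1, 1])`.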

Let $f(x_0, x_1, \ldots, x_n)$ be a homogeneous form of degree $d$ with coefficients
in a finite field $F$. Assume furthermore that the partial derivatives $f_{x_0}, \ldots, f_{x_n}$
have no common projective zero in any algebraic extension of $F$. In this
case we say that the projective hypersurface defined by $f$ is absolutely
nonsingular. Then one may consider the zeta function $Z(t)$ of the hypersurface,
$f = 0$. In this case the Weil conjectures (now theorems) state the following:

(a) $Z(t)$ is a rational function which can be written as

$$Z(t) = \frac{P(t)^{(-1)^n}}{(1 - t)(1 - qt) \cdots (1 - q^{n-1}t)},$$

where $P(t)$ is a polynomial with integer coefficients.

(b) $P(t) = (1 - \alpha_1 t)(1 - \alpha_2 t) \cdots (1 - \alpha_m t)$. The mapping $\alpha \to q^{n-1}/\alpha$ is a
bijection of the set of reciprocal roots $\alpha_1, \ldots, \alpha_m$.

(c) $|\alpha_i| = q^{(n-1)/2}$.

(d) The degree of $P(t)$ is $d^{-1}\left[(d - 1)^{n+1} + (-1)^{n+1}(d - 1)\right]$.

The statement regarding the absolute value of the roots is known as the 
Riemann hypothesis for the hypersurface. The proof of (a), (b), and (d) for a 
general hypersurface is due to B. Dwork [26]. The proof of the Riemann 
hypothesis is due to P. Deligne (1973). For the general statement of the Weil 
conjectures we refer the reader to Weil [80] and Katz [161]. 


§4 A Proof of the Hasse-Davenport Relation 


Let $F$ be a finite field with $q$ elements and $F_s$ be a field containing $F$ such
that $[F_s : F] = s$. Let $\chi$ be a nontrivial multiplicative character of $F$ and
$\chi' = \chi \circ N_{F_s/F}$. $\chi'$ is a character of $F_s$. We wish to compare the Gauss sums
$g(\chi)$ and $g(\chi')$.

Let us recall the definition of $g(\chi)$ (see Chapter 10, Section 3):

$$g(\chi) = \sum_{t \in F} \chi(t)\psi(t),$$

where $\psi(t)$ is equal to $\zeta_p^{\mathrm{tr}(t)}$. The trace function in this definition coincides
with the function $\mathrm{tr}_{F/F_p}$ introduced in this chapter. Since we are considering
more than one field, it is important to attach subscripts to tr. Now,

$$g(\chi') = \sum_{t \in F_s} \chi'(t)\psi'(t),$$

where $\psi'(t) = \zeta_p^{\mathrm{tr}_{F_s/F_p}(t)}$. Since $\mathrm{tr}_{F_s/F_p}(t) = \mathrm{tr}_{F/F_p}(\mathrm{tr}_{F_s/F}(t))$ it follows that

$$\psi' = \psi \circ \mathrm{tr}_{F_s/F}.$$

For a monic polynomial $f(x) = x^n - c_1x^{n-1} + \cdots + (-1)^n c_n$ in $F[x]$
define $\lambda(f) = \psi(c_1)\chi(c_n)$.

Lemma 1. $\lambda(fg) = \lambda(f)\lambda(g)$ for monic $f, g \in F[x]$.

Proof. If $g(x) = x^m - b_1x^{m-1} + \cdots + (-1)^m b_m$, then $f(x)g(x) = x^{n+m} -
(b_1 + c_1)x^{n+m-1} + \cdots + (-1)^{n+m} b_m c_n$. Thus $\lambda(fg) = \psi(b_1 + c_1)\chi(b_m c_n) =
\psi(b_1)\psi(c_1)\chi(b_m)\chi(c_n) = \psi(c_1)\chi(c_n)\psi(b_1)\chi(b_m) = \lambda(f)\lambda(g)$. □

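Lemma 2. Let $\alpha \in F_s$ and $f(x)$ be the monic irreducible polynomial for $\alpha$ over $F$.
Then

$$\lambda(f)^{s/d} = \chi'(\alpha)\psi'(\alpha), \quad \text{where } d = \deg f.$$

Proof. This result follows easily from Proposition 11.2.4. Namely, if $f(x) =
x^d - c_1x^{d-1} + \cdots + (-1)^d c_d$, then

$$\mathrm{tr}_{F_s/F}(\alpha) = \frac{s}{d}\, c_1 \quad \text{and} \quad N_{F_s/F}(\alpha) = c_d^{s/d}.$$

Now, $\lambda(f) = \psi(c_1)\chi(c_d)$, so

$$\lambda(f)^{s/d} = \psi(c_1)^{s/d}\chi(c_d)^{s/d} = \psi\left(\frac{s}{d}\, c_1\right)\chi(c_d^{s/d}) = \psi(\mathrm{tr}_{F_s/F}(\alpha))\chi(N_{F_s/F}(\alpha)) = \psi'(\alpha)\chi'(\alpha). \qquad \square$$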

Lemma 3. $g(\chi') = \sum (\deg f)\lambda(f)^{s/\deg f}$, where the sum is over all monic
irreducible polynomials of $F[x]$ with degree dividing $s$.

Proof. According to Theorem 1 of Chapter 7 (generalized to $F$ as base
field), $x^{q^s} - x$ is the product of all monic irreducible polynomials of degree
dividing $s$. It follows that every such irreducible polynomial has all its roots
in $F_s$ and conversely that every element in $F_s$ satisfies such a polynomial.
Let $f(x)$ be monic irreducible of degree $d \mid s$. Let $\alpha_1, \alpha_2, \ldots, \alpha_d \in F_s$ be its
roots. Then by Lemma 2

$$\sum_{i=1}^{d} \chi'(\alpha_i)\psi'(\alpha_i) = d\lambda(f)^{s/d}.$$

Summing over all polynomials of the required type yields the result. □

We are now in a position to prove the Hasse-Davenport relation. The
proof is based on the following identity:

$$\sum_f \lambda(f)t^{\deg f} = \prod_f \left(1 - \lambda(f)t^{\deg f}\right)^{-1}, \tag{7}$$

where the sum is over all monic polynomials and the product is over all
monic irreducible polynomials in $F[x]$.

The identity is proved by expanding each term $(1 - \lambda(f)t^{\deg f})^{-1}$ in a
geometric series and using the fact that every monic polynomial can be written
as the product of monic irreducible polynomials in a unique way. The
details are left as an exercise.

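Now,

$$\sum_f \lambda(f)t^{\deg f} = \sum_{s=0}^{\infty} \left( \sum_{\deg f = s} \lambda(f) \right) t^s.$$

We define $\lambda(1) = 1$, as this is necessary for Equation (7) to hold. For $s = 1$
we have

$$\sum_{\deg f = 1} \lambda(f) = \sum_{a \in F} \lambda(x - a) = \sum_{a \in F} \chi(a)\psi(a) = g(\chi).$$

For $s > 1$ we have

$$\sum_{\deg f = s} \lambda(f) = \sum_{c_i \in F} \lambda(x^s - c_1x^{s-1} + \cdots + (-1)^s c_s)
= q^{s-2} \sum_{c_1, c_s} \chi(c_s)\psi(c_1) = q^{s-2} \left( \sum_{c_s} \chi(c_s) \right) \left( \sum_{c_1} \psi(c_1) \right) = 0.$$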

Putting all this together we see that the left-hand side of Equation (7) reduces
to $1 + g(\chi)t$. Using this, take the logarithm of both sides of Equation (7),
differentiate, and multiply both sides of the result by $t$. This yields

$$\frac{g(\chi)t}{1 + g(\chi)t} = \sum_f \frac{\lambda(f)(\deg f)t^{\deg f}}{1 - \lambda(f)t^{\deg f}}.$$

Expand the denominators in geometric series. Then

$$\sum_{s=1}^{\infty} (-1)^{s-1} g(\chi)^s t^s = \sum_f \left( \sum_{r=1}^{\infty} (\deg f)\lambda(f)^r t^{r \deg f} \right).$$

Equating the coefficients of $t^s$ yields

$$(-1)^{s-1} g(\chi)^s = \sum_{\deg f \mid s} (\deg f)\lambda(f)^{s/\deg f}.$$

By Lemma 3, the right-hand side is $g(\chi')$. This completes the proof. □
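
The relation $(-g(\chi))^s = -g(\chi')$ can be spot-checked numerically in the smallest interesting case $s = 2$. The sketch below is an illustration only, under assumptions that are not in the text: $F = F_p$ is a prime field, $F_2 = F_{p^2}$ is modeled as $F_p(\sqrt{d})$ for a quadratic nonresidue $d$ supplied by the caller, the caller also supplies a primitive root, and the function name `hasse_davenport_pair` is made up for this example.

```python
import cmath


def hasse_davenport_pair(p, gen, nonres, order):
    """Return (g(chi), g(chi')) for F = F_p and F_s = F_{p^2} = F_p(sqrt(nonres)).

    chi is the character of F_p with chi(gen^k) = exp(2 pi i k / order), and
    chi' = chi o N is its lift to F_{p^2}.  Assumes p is an odd prime, gen is a
    primitive root mod p, nonres is a quadratic nonresidue mod p, and order
    divides p - 1.
    """
    zeta = cmath.exp(2j * cmath.pi / p)
    log = {pow(gen, k, p): k for k in range(p - 1)}

    def chi(t):
        return cmath.exp(2j * cmath.pi * log[t % p] / order)

    # g(chi) = sum over t in F_p^* of chi(t) psi(t), with psi(t) = zeta^t
    g = sum(chi(t) * zeta ** t for t in range(1, p))

    # An element of F_{p^2} is a + b*sqrt(d); its conjugate over F_p is
    # a - b*sqrt(d), so the norm is a^2 - d*b^2 and the trace is 2a.
    g_lift = 0
    for a in range(p):
        for b in range(p):
            if a == 0 and b == 0:
                continue
            norm = (a * a - nonres * b * b) % p
            trace = (2 * a) % p
            g_lift += chi(norm) * zeta ** trace
    return g, g_lift
```

With `g, g_lift = hasse_davenport_pair(5, 2, 2, 4)`, the quantity `(-g) ** 2 + g_lift` should vanish up to rounding error, which is the case $s = 2$ of Theorem 1 for a character of order 4 on $F_5$.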


§5 The Last Entry 

The last entry of Gauss’s mathematical diary is a statement of the following 
remarkable conjecture: 

Suppose that $p \equiv 1 \ (4)$. Then the number of solutions to the congruence
$x^2 + y^2 + x^2y^2 \equiv 1 \ (p)$ is $p + 1 - 2a$, where $p = a^2 + b^2$ and
$a + bi \equiv 1 \ (2 + 2i)$.

Some explanation is in order. If $p \equiv 1 \ (4)$, then by Proposition 8.3.1
we know that $p = a^2 + b^2$ for some integers $a$ and $b$. If we choose $a$ odd and $b$
even, then $a$ and $b$ are uniquely determined up to sign. The congruence
$a + bi \equiv 1 \ (2 + 2i)$ determines the sign of $a$. We shall give a simpler
formulation of this.

Lemma. If $p \equiv 1 \ (4)$, $p = a^2 + b^2$, and $a + bi \equiv 1 \ (2 + 2i)$, then $a$ is odd and
$b$ is even. Moreover, if $4 \mid b$, then $a \equiv 1 \ (4)$, and if $4 \nmid b$, then $a \equiv -1 \ (4)$.

Proof. $a + bi \equiv 1 \ (2 + 2i)$ implies that $a + bi \equiv 1 \ (2)$ and so $a$ is odd and $b$
even.

Since $4 = -2(i - 1)(i + 1)$ it follows that if $4 \mid b$, then $a + bi \equiv a \equiv 1 \
(2 + 2i)$. Taking conjugates, $a \equiv 1 \ (2 - 2i)$. Thus $(2 + 2i)(2 - 2i) = 8 \mid (a - 1)^2$
and $a \equiv 1 \ (4)$.

If $4 \nmid b$, then $b = 4k + 2$ for some $k$. Thus $a + bi \equiv a + 2i \equiv 1 \ (2 + 2i)$.
Since $2i \equiv -2 \ (2 + 2i)$ we have $a \equiv 3 \equiv -1 \ (2 + 2i)$. As before $8 \mid (a + 1)^2$
and so $a \equiv -1 \ (4)$. □

Theorem. Consider the curve $C$ determined by $x^2t^2 + y^2t^2 + x^2y^2 - t^4$ over $F_p$,
where $p \equiv 1 \ (4)$. Write $p = a^2 + b^2$ with $a$ odd and $b$ even. If $4 \mid b$, choose
$a \equiv 1 \ (4)$; if $4 \nmid b$, choose $a \equiv -1 \ (4)$. Then the number of points on $C$ in
$P^2(F_p)$ is $p - 1 - 2a$.

The zeta function of $C$ is

$$Z(u) = \frac{(1 - 2au + pu^2)(1 - u)}{1 - pu}.$$

Before giving the proof a few remarks are in order.

The answer $p - 1 - 2a$ differs from Gauss' $p + 1 - 2a$. The difficulty
is that Gauss counts four points at infinity, whereas a simple calculation
shows that $[0, 1, 0]$ and $[0, 0, 1]$ are the only points at infinity according to
our definition. Thus our answer differs from his by 2.





Since there are two points at infinity independently of $p$ it suffices to
count the number of finite points, i.e., the solutions to $x^2 + y^2 + x^2y^2 = 1$.

As an example take $p = 5$. Since $5 = 1^2 + 2^2$ we have $4 \nmid b$ so we must
take $a = -1$. The formula $p - 1 - 2a$ gives the answer 6 in this case.
Indeed, in addition to the two points at infinity, $(1, 0)$, $(-1, 0)$, $(0, 1)$, and
$(0, -1)$ are the other points on the curve in $F_p$.
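
The same check can be carried out mechanically for other primes. The following Python sketch is illustrative only; the helper names `gauss_a` and `finite_points` are introduced here and are not from the text. It picks the sign of $a$ by the rule in the theorem and counts the finite points by direct search, so that the finite count plus the two points at infinity can be compared with $p - 1 - 2a$.

```python
from math import isqrt


def gauss_a(p):
    """Return a with p = a^2 + b^2, a odd, b even, signed as in the theorem:
    a = 1 (mod 4) if 4 divides b, and a = -1 (mod 4) otherwise.
    Assumes p is a prime with p = 1 (mod 4)."""
    for a in range(1, p, 2):              # try positive odd a
        b_squared = p - a * a
        if b_squared <= 0:
            break
        b = isqrt(b_squared)
        if b * b == b_squared:
            wanted = 1 if b % 4 == 0 else 3   # required residue of a mod 4
            return a if a % 4 == wanted else -a
    raise ValueError("p is not a sum of an odd square and an even square")


def finite_points(p):
    """Count solutions of x^2 + y^2 + x^2 y^2 = 1 in F_p by direct search."""
    return sum(
        1
        for x in range(p)
        for y in range(p)
        if (x * x + y * y + x * x * y * y - 1) % p == 0
    )
```

For $p = 5$ this gives `gauss_a(5) == -1` and `finite_points(5) == 4`, in agreement with the four finite points listed above.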

The form of the zeta function may be surprising. The explanation is that 
the two points at infinity are singular. Thus the form of this zeta function is 
not in contradiction to our earlier observations. 

We now proceed to prove the theorem. Denote by $C_1$ the curve given by
$x^2 + y^2 + x^2y^2 = 1$ and by $C_2$ the curve given by $w^2 = 1 - z^4$. We shall
construct maps from $C_1$ to $C_2$ and from $C_2$ to $C_1$.

Notice that

$$x^2 + y^2 + x^2y^2 = 1$$

implies that

$$(1 + x^2)y^2 = 1 - x^2$$

and

$$[(1 + x^2)y]^2 = 1 - x^4.$$

Thus, if $(a, b)$ is on $C_1$, then $(a, (1 + a^2)b)$ is on $C_2$. Let

$$\lambda(x, y) = (x, (1 + x^2)y).$$

$\lambda$ maps $C_1$ to $C_2$. It is easy to see that this map is one to one.
Now let

$$\mu(z, w) = \left(z, \frac{w}{1 + z^2}\right).$$

$\mu$ is not always defined. If $\alpha \in F_p$ is such that $\alpha^2 = -1$, then $(\alpha, 0)$ and
$(-\alpha, 0)$ are on $C_2$ but $\mu$ is undefined at these points. $\mu$ is defined at all other
points of $C_2$ and maps these points to $C_1$. It is easy to check that $\mu$ is inverse
to $\lambda$ where it is defined. Thus

$$N_1 = N_2 - 2,$$

where $N_1$ and $N_2$ are the number of finite points in $F_p$ on $C_1$ and $C_2$,
respectively.

We can compute $N_2$ by using Theorem 5 of Chapter 8. Specializing
Theorem 5 to $w^2 + z^4 = 1$ we see that

$$N_2 = p + J(\rho, \chi) + J(\rho, \chi^2) + J(\rho, \chi^3),$$

where $\rho$ is the character of order 2 and $\chi$ is a character of order 4.

Since $\chi^2 = \rho$, we have $J(\rho, \chi^2) = J(\rho, \rho) = -\rho(-1) = -1$. Also, since
$\chi^4 = \varepsilon$ we have $\chi^3 = \bar{\chi}$ so that $J(\rho, \chi^3) = J(\rho, \bar{\chi}) = \overline{J(\rho, \chi)}$.

Let $\pi = -J(\rho, \chi)$. Then

$$N_2 = p - 1 - \pi - \bar{\pi}.$$

$\rho$ takes on the values $\pm 1$ and $\chi$ takes on the values $\pm 1, \pm i$. Thus $\pi = a + bi$,
where $a, b \in \mathbb{Z}$. Moreover $|J(\rho, \chi)|^2 = p$ so that $a^2 + b^2 = \pi\bar{\pi} = p$. It
follows that $N_2 = p - 1 - 2a$ and $N_1 = p - 3 - 2a$. Since $C_1$ has two points
at infinity, the total number of points on $C_1$ in $F_p$ is given by

$$N = p - 1 - 2a.$$

By the lemma it suffices to prove that $\pi \equiv 1 \ (2 + 2i)$ in order to complete
the proof of the first part of the theorem. This is accomplished by means of
the following pretty calculation given in Hasse-Davenport [23].

Notice that $\rho(a) - 1 \equiv 0 \ (2)$ and that $\chi(a) - 1 \equiv 0 \ (1 + i)$ for all $a \neq 0$
in $F_p$. The first assertion is obvious; the second follows from $1 - 1 = 0$,
$-1 - 1 = -(1 - i)(1 + i)$, $-i - 1 = -(1 + i)$, and $i - 1 = i(1 + i)$. Thus
if $a \neq 0$ and $b \neq 0$, $(\rho(a) - 1)(\chi(b) - 1) \equiv 0 \ (2 + 2i)$. This congruence is
trivially true for the pairs $a = 0$, $b = 1$ and $a = 1$, $b = 0$. Therefore,

$$\sum_{a+b=1} (\rho(a) - 1)(\chi(b) - 1) \equiv 0 \ (2 + 2i).$$

Expanding we see that

$$-\pi - \sum_b \chi(b) - \sum_a \rho(a) + p \equiv 0 \ (2 + 2i).$$

The second and third terms are zero. Thus

$$\pi \equiv p \equiv 1 \ (2 + 2i).$$

The last step follows because $p \equiv 1 \ (4)$ by hypothesis, and $2 + 2i$ divides 4;
indeed $4 = (1 - i)(2 + 2i)$.
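
The congruence $\pi \equiv 1 \ (2 + 2i)$ and the count $N_2 = p - 1 - \pi - \bar{\pi}$ can both be confirmed for small primes with exact integer arithmetic. The sketch below is an illustration under assumptions made here: $\chi$ is taken to be the character of order 4 with $\chi(g) = i$ for the smallest primitive root $g$ (the other choice replaces $\pi$ by $\bar{\pi}$), Gaussian integers are stored as pairs of Python integers, and all function names are hypothetical.

```python
def primitive_root(p):
    """Smallest generator of the multiplicative group of F_p (p an odd prime)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def pi_from_jacobi(p):
    """Return pi = -J(rho, chi) as a pair (re, im) of integers, for p = 1 (mod 4).

    chi is the order-4 character with chi(g) = i and rho = chi^2.
    """
    g = primitive_root(p)
    log = {pow(g, k, p): k for k in range(p - 1)}
    powers_of_i = [(1, 0), (0, 1), (-1, 0), (0, -1)]
    re = im = 0
    for a in range(2, p):                      # a = 0 and a = 1 contribute 0
        b = (1 - a) % p
        exponent = (2 * log[a] + log[b]) % 4   # rho(a) chi(b) = i^(2 log a + log b)
        re += powers_of_i[exponent][0]
        im += powers_of_i[exponent][1]
    return -re, -im


def is_one_mod_2_plus_2i(re, im):
    """Test re + im*i = 1 (2 + 2i), using (x + yi)/(2 + 2i) = ((x + y) + (y - x)i)/4."""
    x, y = re - 1, im
    return (x + y) % 4 == 0 and (y - x) % 4 == 0


def points_on_C2(p):
    """Count solutions of w^2 = 1 - z^4 in F_p by direct search."""
    return sum(1 for z in range(p) for w in range(p) if (w * w + z ** 4 - 1) % p == 0)
```

For $p = 5$ the Jacobi sum is $J(\rho, \chi) = 1 + 2i$, so `pi_from_jacobi(5)` returns `(-1, -2)`; then $\pi - 1 = -(2 + 2i)$ and `points_on_C2(5)` equals $p - 1 - 2a = 6$.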

To calculate the zeta function it suffices to notice that by the
Hasse-Davenport relation the number of points on $x^2t^2 + y^2t^2 + x^2y^2 - t^4$ in
$P^2(F_{p^s})$ is given by

$$p^s - 1 - (-J(\rho, \chi))^s - (-\overline{J(\rho, \chi)})^s = p^s - 1 - \pi^s - \bar{\pi}^s.$$

Thus

$$Z(u) = \frac{(1 - \pi u)(1 - \bar{\pi} u)}{1 - pu}\,(1 - u) = \frac{1 - 2au + pu^2}{1 - pu}\,(1 - u).$$
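
The formula for $s = 2$ can be tested against a direct count over $F_{p^2}$. Since $\pi^2 + \bar{\pi}^2 = 2(a^2 - b^2) = 2(2a^2 - p)$, the number of finite points over $F_{p^2}$ should be $p^2 - 3 - 2(2a^2 - p)$. The following sketch is an illustration only: it assumes $F_{p^2}$ is modeled as $F_p(\sqrt{d})$ with $d$ a quadratic nonresidue chosen by the caller, and the function name is made up for this example.

```python
def finite_points_quadratic_extension(p, nonres):
    """Count solutions of x^2 + y^2 + x^2 y^2 = 1 in F_{p^2} = F_p(sqrt(nonres)).

    Elements are pairs (a, b) standing for a + b*sqrt(nonres); nonres must be a
    quadratic nonresidue modulo the odd prime p.
    """
    def mul(u, v):
        return ((u[0] * v[0] + nonres * u[1] * v[1]) % p,
                (u[0] * v[1] + u[1] * v[0]) % p)

    elements = [(a, b) for a in range(p) for b in range(p)]
    squares = [mul(u, u) for u in elements]   # x^2 for every x, with repetition
    count = 0
    for xs in squares:
        for ys in squares:
            prod = mul(xs, ys)
            total = ((xs[0] + ys[0] + prod[0]) % p, (xs[1] + ys[1] + prod[1]) % p)
            if total == (1, 0):
                count += 1
    return count
```

For $p = 5$ (where $a = -1$) the prediction is $25 - 3 - 2(2 - 5) = 28$ finite points, and `finite_points_quadratic_extension(5, 2)` returns 28; adding the two points at infinity gives $p^2 - 1 - \pi^2 - \bar{\pi}^2 = 30$.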



Notes 

As we have mentioned, in his thesis E. Artin [2] introduced the congruence 
zeta function. In that work he establishes the analog of the Riemann hy¬ 
pothesis for about 40 curves of the type y 2 = /(x), where / is a cubic or 
quartic polynomial. In 1934 Hasse proved that the result held in general for 
nonsingular cubics (the case of elliptic curves). The Riemann hypothesis for 
arbitrary nonsingular curves was established in full generality by Weil in
1948. His proof is far from elementary and uses deep techniques in algebraic 
geometry. 

Weil’s conjecture that the zeta function of any algebraic set is rational was 
proved in 1959 by B. Dwork using methods of p-adic analysis [26]. 

In 1969 S. A. Stepanov succeeded in giving an elementary proof of the 
Riemann hypothesis for curves [222]. A complete account of Stepanov’s 
method is given in the book by W. M. Schmidt, Equations over Finite Fields: 
An Elementary Approach [218]. This method was simplified further by E. 
Bombieri, who, using the Riemann-Roch theorem, gives a complete proof 
in five pages [98]. Sharper estimates in special cases have been obtained by 
H. Stark [221]. For an analysis of Deligne’s proof and an historical discussion 
of the entire issue the reader should consult N. Katz’s “Overview of Deligne’s
proof …” [161]. This paper also contains an extensive bibliography of the
subject. See also the survey [248]. The discovery of these remarkable theorems
is discussed by Weil in the first volume of his Collected Papers, [241], pp.
568-569. Finally we mention the paper by J. R. Joly, “Équations et variétés
algébriques sur un corps fini” [160].

Section 5 on Gauss’ conjecture is logically out of place since it could have 
been given in Chapter 8. We felt it was appropriate at this point since the 
relation between this conjecture and Weil’s Riemann hypothesis reveals once 
again the remarkable acuity of Gauss’ insight and how his imposing presence 
continues to make itself felt to this very day. 

A new edition of the mathematical journal of Gauss, translated from 
Latin to German, with an historical review by K. Biermann and comments 
by H. Wussing is now available [137]. This important historical document 
records the major discoveries of Gauss between the years 1796 and 1814. 
It is interesting to note that both the first entry (Section 11 of Chapter 9)
and the last entry are concerned with cyclotomy. For more biographical
information on Gauss see T. Hall [143] and the recent biography by W. K.
Bühler [101].


Exercises 

1. Suppose that we may write the power series $1 + a_1u + a_2u^2 + \cdots$ as the quotient
of two polynomials $P(u)/Q(u)$. Show that we may assume that $P(0) = Q(0) = 1$.

2. Prove the converse to Proposition 11.1.1. 


3. Give the details of the proof that N s is independent of the field F s (see the concluding 
paragraph to Section 1). 


4. Calculate the zeta function of $x_0x_1 - x_2x_3 = 0$ over $F_p$.

5. Calculate as explicitly as possible the zeta function of $a_0x_0^2 + a_1x_1^2 + \cdots + a_nx_n^2$
over $F_q$, where $q$ is odd. The answer will depend on whether $n$ is odd or even and
whether $q \equiv 1 \ (4)$ or $q \equiv 3 \ (4)$.

6. Consider $x_0^3 + x_1^3 + x_2^3 = 0$ as an equation over $F_4$, the field with four elements.
Show that there are nine points on the curve in $P^2(F_4)$. Calculate the zeta function.
[Answer: $(1 + 2u)^2/((1 - u)(1 - 4u))$.]

7. Try this exercise if you know a little projective geometry. Let $N_s$ be the number
of lines in $P^n(F_{p^s})$. Find $N_s$ and calculate $\sum_{s=1}^{\infty} N_s u^s/s$. (The set of lines in projective
space form an algebraic variety called a Grassmannian variety. So do the set of
planes, three-dimensional linear subspaces, etc.)

8. If $f$ is a nonhomogeneous polynomial, we can consider the zeta function of the
projective closure of the hypersurface defined by $f$ (see Chapter 10). One way to
calculate this is to count the number of points on $H_f(F_q)$ and then add to it the
number of points at infinity. For example, consider $y^2 = x^3$ over $F_{p^s}$. Show that
there is one point at infinity. The origin $(0, 0)$ is clearly on this curve. If $x \neq 0$,
write $(y/x)^2 = x$ and show that there are $p^s - 1$ more points on this curve.
Altogether we have $p^s + 1$ points and the zeta function over $F_p$ is
$(1 - u)^{-1}(1 - pu)^{-1}$.

9. Calculate the zeta function of $y^2 = x^3 + x^2$ over $F_p$.

10. If $A \neq 0$ in $F_q$ and $q \equiv 1 \ (3)$, show that the zeta function of $y^2 = x^3 + A$ over $F_q$
has the form $Z(u) = (1 + au + qu^2)/((1 - u)(1 - qu))$, where $a \in \mathbb{Z}$ and $|a| \le 2q^{1/2}$.

11. Consider the curve $y^2 = x^3 - Dx$ over $F_p$, where $D \neq 0$. Call this curve $C_1$. Show
that the substitution $x = \frac{1}{2}(u + v^2)$ and $y = \frac{1}{2}v(u + v^2)$ transforms $C_1$ into the curve
$C_2$ given by $u^2 - v^4 = 4D$. Show that in any given finite field the number of finite
points on $C_1$ is one more than the number of finite points on $C_2$.

12. (continuation) If $p \equiv 3 \ (4)$, show that the number of projective points on $C_1$ is
just $p + 1$. If $p \equiv 1 \ (4)$, show that the answer is $p + 1 + \overline{\chi(D)}J(\chi, \chi^2) + \chi(D)\overline{J(\chi, \chi^2)}$,
where $\chi$ is a character of order 4 on $F_p$.

13. (continuation) If $p \equiv 1 \ (4)$, calculate the zeta function of $y^2 = x^3 - Dx$ over $F_p$
in terms of $\pi$ and $\chi(D)$, where $\pi = -J(\chi, \chi^2)$. This calculation in somewhat sharpened
form is contained in [23]. The result has played a key role in recent empirical work
of B. J. Birch and H. P. F. Swinnerton-Dyer on elliptic curves.

14. Suppose that $p \equiv 1 \ (4)$ and consider the curve $x^4 + y^4 = 1$ over $F_p$. Let $\chi$ be a
character of order 4 and $\pi = -J(\chi, \chi^2)$. Give a formula for the number of projective
points over $F_p$ and calculate the zeta function. Both answers should depend only
on $\pi$. (Hint: See Exercises 7 and 16 of Chapter 8, but be careful since there we were
counting only finite points.)

15. Find the number of points on $x^2 + y^2 + x^2y^2 = 1$ for $p = 13$ and $p = 17$. Do it
both by means of the formula in Section 5 and by direct calculation.

16. Let $F$ be a field with $q$ elements and $F_s$ an extension of degree $s$. If $\chi$ is a character of
$F$, let $\chi' = \chi \circ N_{F_s/F}$. Show that

(a) $\chi'$ is a character of $F_s$.

(b) $\chi \neq \rho$ implies that $\chi' \neq \rho'$.

(c) $\chi^m = \varepsilon$ implies that $\chi'^m = \varepsilon$.

(d) $\chi'(a) = \chi(a)^s$ for $a \in F$.

(e) As $\chi$ varies over all characters of $F$ with order dividing $m$, $\chi'$ varies over all characters
of $F_s$ with order dividing $m$. Here we are assuming that $q \equiv 1 \ (m)$.





17. In Theorem 2 show that the numerator of the zeta function, $P(u)$, has
degree $m^{-1}((m - 1)^{n+1} + (-1)^{n+1}(m - 1))$.

18. Let the notation be as in Exercise 16. Use the Hasse-Davenport relation to show that
$J(\chi_1', \chi_2', \ldots, \chi_n') = (-1)^{(s-1)(n-1)} J(\chi_1, \chi_2, \ldots, \chi_n)^s$, where the $\chi_i$ are nontrivial
characters of $F$ and $\chi_1 \chi_2 \cdots \chi_n \neq \varepsilon$.

19. Prove the identity $\sum \lambda(f)t^{\deg f} = \prod (1 - \lambda(f)t^{\deg f})^{-1}$, where the sum is over all
monic polynomials in $F[t]$ and the product is over all monic irreducibles in $F[t]$.
$\lambda$ is defined in Section 4.

20. If in Theorem 2 we keep $f$ fixed but consider the base field to be $F_s$ instead of $F$,
we get a different zeta function, $Z_f^{(s)}(u)$. Show that $Z_f^{(s)}(u)$ and $Z_f(u)$ are related by
the equation $Z_f^{(s)}(u^s) = Z_f(u)Z_f(\rho u) \cdots Z_f(\rho^{s-1}u)$, where $\rho = e^{2\pi i/s}$.

21. In Exercise 6 we considered the equation $x_0^3 + x_1^3 + x_2^3 = 0$ over the field with
four elements. Consider the same equation over the field with two elements. The
trouble here is that $2 \not\equiv 1 \ (3)$ and so our usual calculations do not work. Prove that
in every extension of $\mathbb{Z}/2\mathbb{Z}$ of odd degree every element is a cube and that in every
extension of even degree, 3 divides the order of the multiplicative group. Use this
information to calculate the zeta function over $\mathbb{Z}/2\mathbb{Z}$. [Answer: $(1 + 2u^2)/((1 - u)(1 - 2u))$.]


22. Use the ideas developed in Exercise 21 to show that Theorem 2 continues to hold
(in a suitable sense) even when the hypothesis $q \equiv 1 \ (m)$ is removed.

23. Let $p_1 < p_2 < p_3 < \cdots$ denote the positive prime numbers arranged in order. Let
$N_m = p_1^m p_2^m \cdots p_m^m$ and let $E_m$ denote the field with $q^{N_m}$ elements. Show that $E_m$
can be considered as a subfield of $E_{m+1}$ and that $E = \bigcup E_m$ is an extension of $E_0 = F$,
a finite field with $q$ elements, with the following property: for every positive integer
$n$, $E$ contains one and only one subfield $F_n$ with $q^n$ elements.
In this chapter we shall introduce the concept of an
algebraic number field and develop its basic properties.
Our treatment will be classical, developing directly only
those aspects that will be needed in subsequent chapters.
The study of these fields, and their interaction with other
branches of mathematics forms a vast area of current
research. Our objective is to develop as much of the
general theory as is needed to study higher-power
reciprocity. The reader who is interested in a more systematic
treatment of these fields should consult any one of the
standard texts on this subject, e.g., Ribenboim [207],
Lang [168], Goldstein [140], Marcus [183].

We will assume that the reader has some familiarity
with the theory of separable field extensions as can be
found, for example, in Herstein’s Topics in Algebra [150].
Some of the results assumed are given in the Exercises.


§1 Algebraic Preliminaries 

In this section we will recall some facts from field theory and prove some 
results about discriminants. 

Let $L/K$ be a finite algebraic extension of fields. The dimension of $L/K$,
$[L : K]$, will be denoted by $n$.

Suppose $\alpha_1, \alpha_2, \ldots, \alpha_n$ is a basis for $L/K$ and $\alpha \in L$. Then $\alpha\alpha_i = \sum_j a_{ij}\alpha_j$
with $a_{ij} \in K$.

Definition. The norm of $\alpha$, $N_{L/K}(\alpha)$, is $\det(a_{ij})$. The trace of $\alpha$, $t_{L/K}(\alpha)$, is
$a_{11} + a_{22} + \cdots + a_{nn}$.

It is easy to check that this definition is independent of the choice of a 
basis. In what follows, norm and trace will be denoted by N and t since the 
extension L/K will be fixed. 

If $\alpha, \beta \in L$ and $a \in K$ then $N(\alpha\beta) = N(\alpha)N(\beta)$, $t(\alpha + \beta) = t(\alpha) + t(\beta)$,
$N(a\beta) = a^n N(\beta)$, and $t(a\alpha) = at(\alpha)$. If $\alpha \neq 0$ then $N(\alpha)N(\alpha^{-1}) = N(\alpha\alpha^{-1}) =
N(1) = 1$. Thus, if $\alpha \neq 0$, $N(\alpha) \neq 0$, and $N(\alpha^{-1}) = N(\alpha)^{-1}$. If $L/K$ is separable,
then $t$ is not identically zero. If char $K = 0$ this is easy to see since then $t(1) =
n \neq 0$. The only fields of characteristic $p > 0$ that we will consider are finite
fields and in this case the result follows from Proposition 11.2.1(d).

Suppose $L/K$ is separable and let $\sigma_1, \sigma_2, \ldots, \sigma_n$ be the distinct isomorphisms
of $L$ into a fixed algebraic closure of $K$ which leave $K$ fixed. For $\alpha \in L$ denote
$\sigma_j(\alpha)$ by $\alpha^{(j)}$. The elements $\alpha^{(j)}$ are called the conjugates of $\alpha$. Here $\alpha^{(1)}$ is $\alpha$.

One can show using linear algebra (see Exercises 21-23) that $t(\alpha) = \alpha^{(1)} +
\alpha^{(2)} + \cdots + \alpha^{(n)}$ and that $N(\alpha) = \alpha^{(1)}\alpha^{(2)} \cdots \alpha^{(n)}$. If $\alpha \in L$ consider $f(x) =
(x - \alpha^{(1)})(x - \alpha^{(2)}) \cdots (x - \alpha^{(n)})$. Then $f(x) \in K[x]$. The coefficient of $x^{n-1}$
is $-t(\alpha)$ and the constant term is $(-1)^n N(\alpha)$. The reader should verify that our
definitions of norm and trace generalize those of Chapter 11, Section 2.

Definition. If $\alpha_1, \alpha_2, \ldots, \alpha_n$ is an $n$-tuple of elements of $L$ we define the
discriminant $\Delta(\alpha_1, \ldots, \alpha_n)$ to be $\det(t(\alpha_i\alpha_j))$.

Proposition 12.1.1. If $\Delta(\alpha_1, \ldots, \alpha_n) \neq 0$ then $\alpha_1, \ldots, \alpha_n$ is a basis for $L/K$.
If $L/K$ is separable and $\alpha_1, \ldots, \alpha_n$ is a basis for $L/K$ then $\Delta(\alpha_1, \ldots, \alpha_n) \neq 0$.

Proof. Suppose $\alpha_1, \ldots, \alpha_n$ are linearly dependent. Then there exist
$a_1, \ldots, a_n \in K$, not all zero, such that $\sum_i a_i\alpha_i = 0$. Multiply this equation by
$\alpha_j$ and take the trace. One finds

$$\sum_i a_i t(\alpha_i\alpha_j) = 0, \quad j = 1, 2, \ldots, n.$$

This shows that the matrix $(t(\alpha_i\alpha_j))$ is singular and so its determinant is
zero.

Now suppose $\alpha_1, \ldots, \alpha_n$ is a basis and $\Delta(\alpha_1, \ldots, \alpha_n) = 0$. Then the system
of linear equations

$$\sum_i x_i t(\alpha_i\alpha_j) = 0, \quad j = 1, \ldots, n,$$

has a nontrivial solution $x_i = a_i \in K$, $i = 1, \ldots, n$. Let $\alpha = \sum_i a_i\alpha_i \neq 0$.
Then, $t(\alpha\alpha_j) = 0$ for $j = 1, 2, \ldots, n$, and since $\alpha_1, \ldots, \alpha_n$ is a basis it follows
that $t(\alpha\beta) = 0$ for all $\beta \in L$. This implies $t$ is identically zero which it is not
since $L/K$ is separable. This establishes the second assertion. □

Proposition 12.1.2. Suppose $\alpha_1, \ldots, \alpha_n$ and $\beta_1, \ldots, \beta_n$ are bases for $L/K$. Let
$\alpha_i = \sum_j a_{ij}\beta_j$, $a_{ij} \in K$. Then $\Delta(\alpha_1, \ldots, \alpha_n) = \det(a_{ij})^2 \Delta(\beta_1, \ldots, \beta_n)$.

Proof. Take the trace of both sides of the identity $\alpha_i\alpha_k = \sum_j \sum_l a_{ij}a_{kl}\beta_j\beta_l$.
Let $A = (t(\alpha_i\alpha_k))$, $B = (t(\beta_j\beta_l))$, and $C = (a_{ij})$. Then we find the matrix
identity, $A = CBC'$, where $C'$ is the transpose of $C$. Taking the determinant
of both sides of this matrix identity and noting that $\det C = \det C'$ gives the
result. □

Proposition 12.1.3. For $\alpha_1, \alpha_2, \ldots, \alpha_n \in L$ and $L/K$ separable we have

$$\Delta(\alpha_1, \ldots, \alpha_n) = \det(\alpha_i^{(j)})^2.$$

Proof. $t(\alpha_i\alpha_j) = \alpha_i^{(1)}\alpha_j^{(1)} + \alpha_i^{(2)}\alpha_j^{(2)} + \cdots + \alpha_i^{(n)}\alpha_j^{(n)}$. Let $A = (t(\alpha_i\alpha_j))$ and
$B = (\alpha_i^{(j)})$. Then $A = BB'$. Taking determinants of both sides of this matrix
equation gives the result. □

Proposition 12.1.4. Suppose $1, \beta, \ldots, \beta^{n-1}$ are in $L$ and linearly independent
over $K$. Let $f(x) \in K[x]$ be the minimal polynomial for $\beta$ over $K$. If $L/K$ is
separable then

$$\Delta(1, \beta, \ldots, \beta^{n-1}) = (-1)^{n(n-1)/2} N(f'(\beta)),$$

where $f'(x)$ is the formal derivative of $f(x)$.

Proof. The matrix $((\beta^{(j)})^i)$ where $j = 1, \ldots, n$ and $i = 0, \ldots, n - 1$ is of
Vandermonde type and so its determinant is

$$\prod_{i<j} (\beta^{(j)} - \beta^{(i)}).$$

Thus, by Proposition 12.1.3, we have

$$\Delta(1, \beta, \ldots, \beta^{n-1}) = (-1)^{n(n-1)/2} \prod_{i \neq j} (\beta^{(j)} - \beta^{(i)}).$$

Now, $f(x) = \prod_i (x - \beta^{(i)})$, so $f'(\beta^{(j)}) = \prod_i (\beta^{(j)} - \beta^{(i)})$ with $i \neq j$. Since
$f'(\beta^{(j)}) = (f'(\beta))^{(j)}$ the result follows by taking the product over $j$. □
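
As a quick check of Proposition 12.1.4, consider the quadratic case $n = 2$. The worked example below is added for illustration and assumes $f(x) = x^2 + bx + c$ is irreducible over $K$ with roots $\beta$ and $\beta'$, so that $\beta + \beta' = -b$ and $\beta\beta' = c$:

```latex
\begin{align*}
f'(x) &= 2x + b, \\
N(f'(\beta)) &= (2\beta + b)(2\beta' + b)
   = 4\beta\beta' + 2b(\beta + \beta') + b^2
   = 4c - 2b^2 + b^2 = 4c - b^2, \\
\Delta(1, \beta) &= (-1)^{2(2-1)/2}\, N(f'(\beta)) = -(4c - b^2) = b^2 - 4c .
\end{align*}
```

So for $n = 2$ the discriminant of the basis $1, \beta$ is the familiar discriminant of the quadratic; for instance $f(x) = x^2 + 1$ gives $-4$.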


§2 Unique Factorization in Algebraic Number Fields 


Elementary number theory is concerned with the properties of the natural
numbers 1, 2, 3, .... In the course of studying these properties it became
necessary to take into account the ring of integers $\mathbb{Z}$ and then the field of
rational numbers $\mathbb{Q}$. In his attempt to understand biquadratic reciprocity
Gauss introduced the ring $\mathbb{Z}[i]$. Likewise to study higher reciprocity laws and
Fermat’s Last Theorem (see Chapter 14) other rings were introduced.
Eventually a general definition of algebraic number fields and rings of
algebraic integers emerged, principally through the efforts of E. Kummer and
R. Dedekind.

Definition. A subfield F of the complex numbers is called an algebraic number 
field if [F : Q] is finite. If F is such a field, the subset of F consisting of algebraic 
integers forms a ring D, called the ring of algebraic integers in F. 

Proposition 6.1.2 shows that an algebraic number field consists of
algebraic numbers (just take $V = F$ and choose $\gamma_1, \ldots, \gamma_n$ to be a basis for $F$
over $\mathbb{Q}$).

Let $\Omega$ be the set of all algebraic integers. Then Proposition 6.1.5 shows $\Omega$
is a ring. Since $D = \Omega \cap F$, $D$ is also a ring. We will often refer to $D$ simply
as the ring of integers in $F$.

It turns out that in general D is not a unique factorization domain 
(Exercise 7). However D does have a property which is almost as good. 
Namely, every nonzero ideal can be written uniquely as a product of prime 
ideals. An integral domain with this property is called a Dedekind ring. 
In this section we will prove that D is a Dedekind ring following a method 
due to A. Hurwitz [154] (pp. 236-243). 

Throughout the discussion the word ideal will mean nonzero ideal. 
Hopefully, this will not cause confusion. 

Lemma 1. Suppose $\beta \in F$. There is a $b \in \mathbb{Z}$, $b \neq 0$, such that $b\beta \in D$.

Proof. $\beta$ satisfies an equation $a_0\beta^n + a_1\beta^{n-1} + \cdots + a_n = 0$ with the
$a_i \in \mathbb{Z}$, $a_0 \neq 0$. Multiply both sides by $a_0^{n-1}$ and notice that $(a_0\beta)^n + a_1(a_0\beta)^{n-1}
+ \cdots + a_na_0^{n-1} = 0$. This shows $a_0\beta$ is an algebraic integer since for all $i$,
$a_ia_0^{i-1} \in \mathbb{Z}$. □
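
A small instance of Lemma 1 may help fix the idea. The example below is an illustration added here, taking $F = \mathbb{Q}(\sqrt{2})$ and $\beta = 1/\sqrt{2}$, which is not an algebraic integer:

```latex
\begin{align*}
\beta = \tfrac{1}{\sqrt{2}}: \quad & 2\beta^2 - 1 = 0
      \qquad (a_0 = 2,\ a_1 = 0,\ a_2 = -1,\ n = 2), \\
\text{multiply by } a_0^{\,n-1} = 2: \quad & (2\beta)^2 + 0 \cdot (2\beta) - 2 = 0, \\
\text{hence } 2\beta = \sqrt{2} \ & \text{is a root of the monic polynomial } x^2 - 2 \in \mathbb{Z}[x].
\end{align*}
```

Here $b = a_0 = 2$ works, exactly as in the proof.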

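Proposition 12.2.1. Every ideal $A$ of $D$ contains a basis for $F$ over $\mathbb{Q}$.

Proof. Let $\beta_1, \ldots, \beta_n$ be a basis for $F$ over $\mathbb{Q}$. By the preceding lemma there is
a $b \in \mathbb{Z}$, $b \neq 0$, such that $b\beta_1, \ldots, b\beta_n \in D$. Choose $\alpha \in A$, $\alpha \neq 0$. Then the
elements $b\beta_1\alpha, \ldots, b\beta_n\alpha$ are in $A$ and are a basis for $F$ over $\mathbb{Q}$. □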

In the first section we considered a field extension L/K and considered 
the trace, norm, and discriminant of a basis. Here we fix the extension F/Q 
and consider all these concepts with respect to this extension. 

If $\alpha \in D$ we claim $N(\alpha)$ and $t(\alpha)$ are in $\mathbb{Z}$. To see this notice that if $\alpha$ satisfies
a monic polynomial with coefficients in $\mathbb{Z}$ so do the conjugates of $\alpha$. Thus
$N(\alpha)$ and $t(\alpha)$, which are respectively the product and sum of the conjugates
of $\alpha$, are algebraic integers. They are also in $\mathbb{Q}$ so by Proposition 6.1.1 they are
in $\mathbb{Z}$. The fact that the trace has this property shows that if $\alpha_1, \ldots, \alpha_n$ is a
basis for $F$ over $\mathbb{Q}$ and all the $\alpha_i \in D$ then $\Delta(\alpha_1, \ldots, \alpha_n) \in \mathbb{Z}$.

Before proceeding we remark that the discriminant of a basis can be
negative. For example, let $i = \sqrt{-1}$ and consider the basis $1, i$ for $\mathbb{Q}(i)/\mathbb{Q}$. A
simple calculation shows $\Delta(1, i) = -4$.
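
The calculation of $\Delta(1, i)$, and the change-of-basis rule of Proposition 12.1.2, can be reproduced with a few lines of exact arithmetic. The sketch below is an illustration restricted to quadratic fields $\mathbb{Q}(\sqrt{d})$; the representation of $a + b\sqrt{d}$ as a pair `(a, b)` and the function names are assumptions made for this example.

```python
from fractions import Fraction


def multiply(d, u, v):
    """Product of u = a + b*sqrt(d) and v = c + e*sqrt(d), each stored as a pair."""
    return (u[0] * v[0] + d * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def trace(u):
    """Trace from Q(sqrt(d)) to Q: the sum of a + b*sqrt(d) and a - b*sqrt(d)."""
    return 2 * u[0]


def discriminant(d, basis):
    """Delta(alpha_1, alpha_2) = det( t(alpha_i alpha_j) ) for a two-element basis."""
    (t11, t12), (t21, t22) = [
        [trace(multiply(d, x, y)) for y in basis] for x in basis
    ]
    return t11 * t22 - t12 * t21
```

For example, `discriminant(-1, [(1, 0), (0, 1)])` returns $-4$, the value of $\Delta(1, i)$. In $\mathbb{Q}(\sqrt{5})$, the basis $1, (1 + \sqrt{5})/2$, entered as `[(1, 0), (Fraction(1, 2), Fraction(1, 2))]`, has discriminant 5, while the basis $1, \sqrt{5}$ has discriminant $20 = 2^2 \cdot 5$; the factor $2^2$ is the square of the determinant of the transition matrix, as Proposition 12.1.2 predicts.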

Proposition 12.2.2. Let $A$ be an ideal in $D$ and suppose $\alpha_1, \ldots, \alpha_n \in A$ is a basis
for $F/\mathbb{Q}$ with $|\Delta(\alpha_1, \ldots, \alpha_n)|$ minimal. Then $A = \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2 + \cdots + \mathbb{Z}\alpha_n$.

Proof. Since the absolute value of the discriminant of a basis in $A$ is a positive
integer, there is such a basis with $|\Delta(\alpha_1, \ldots, \alpha_n)|$ minimal.

Suppose $\alpha \in A$ and write $\alpha = \gamma_1\alpha_1 + \gamma_2\alpha_2 + \cdots + \gamma_n\alpha_n$ with $\gamma_i \in \mathbb{Q}$.
We need to show that the $\gamma_i$ are in $\mathbb{Z}$. Suppose not. Then some $\gamma_i \notin \mathbb{Z}$ and by
relabeling if necessary we can assume $\gamma_1 \notin \mathbb{Z}$. Write $\gamma_1 = m + \theta$ where $m \in \mathbb{Z}$
and $0 < \theta < 1$. Let $\beta_1 = \alpha - m\alpha_1$, $\beta_2 = \alpha_2, \ldots, \beta_n = \alpha_n$. Then $\beta_1, \beta_2, \ldots,
\beta_n \in A$ and is a basis for $F/\mathbb{Q}$. Since $\beta_1 = \theta\alpha_1 + \gamma_2\alpha_2 + \cdots + \gamma_n\alpha_n$ the matrix
of transition between these two bases is

$$\begin{pmatrix}
\theta & \gamma_2 & \gamma_3 & \cdots & \gamma_n \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix}.$$

By Proposition 12.1.2 we find $\Delta(\beta_1, \ldots, \beta_n) = \theta^2 \Delta(\alpha_1, \ldots, \alpha_n)$ which
contradicts the minimality of $|\Delta(\alpha_1, \ldots, \alpha_n)|$ since $0 < \theta < 1$. Thus all the
$\gamma_i \in \mathbb{Z}$ and $A = \mathbb{Z}\alpha_1 + \cdots + \mathbb{Z}\alpha_n$ as asserted. □

If $\alpha_1, \alpha_2, \ldots, \alpha_n \in A$ is a basis for $F$ over $\mathbb{Q}$ and $A = \mathbb{Z}\alpha_1 + \cdots + \mathbb{Z}\alpha_n$
we say that $\alpha_1, \ldots, \alpha_n$ is an integral basis for $A$. It follows from Proposition
12.1.2 that the discriminants of any two integral bases for $A$ are equal, since
the transition matrix between two integral bases has integer entries and an
integer inverse, so its determinant is $\pm 1$. This common value is called the
discriminant of $A$, written $\Delta(A)$. The discriminant of $D$ is particularly
important and, by “abuse of language,” $\delta_F = \Delta(D)$ is called the discriminant
of $F/\mathbb{Q}$.

We now apply the last proposition to deduce some important properties
of the ring $D$. Recall our convention that all ideals are nonzero ideals.

Lemma 2. If $A \subset D$ is an ideal then $A \cap \mathbb{Z} \neq 0$.

Proof. Let $\alpha \in A$, $\alpha \neq 0$. There exist $a_i \in \mathbb{Z}$ such that $\alpha^m + a_1\alpha^{m-1} + \cdots
+ a_m = 0$. Since we are working in a field we may assume $a_m \neq 0$. But then,
$0 \neq a_m \in A \cap \mathbb{Z}$. □

Proposition 12.2.3. For any ideal $A$, $D/A$ is finite.

Proof. By the lemma there is an $a \in A \cap \mathbb{Z}$, $a \neq 0$; replacing $a$ by $-a$ if
necessary, we may assume $a > 0$. Let $(a)$ be the principal
ideal generated by $a$ in $D$. Since $D/(a)$ maps onto $D/A$ it is enough to show
$D/(a)$ is finite. In fact we will show it has precisely $a^n$ elements.

By Proposition 12.2.2 we may write $D = \mathbb{Z}\omega_1 + \mathbb{Z}\omega_2 + \cdots + \mathbb{Z}\omega_n$.
Let $S = \{\sum \gamma_i\omega_i \mid 0 \le \gamma_i < a\}$. We claim $S$ is a set of coset representatives for
$D/(a)$. Suppose $\omega = \sum m_i\omega_i \in D$. Write $m_i = q_ia + \gamma_i$ with $0 \le \gamma_i < a$.
Then clearly $\omega \equiv \sum \gamma_i\omega_i \ (a)$. Thus every coset of $(a)$ contains an element of $S$.
If $\sum \gamma_i\omega_i$ and $\sum \gamma_i'\omega_i$ are in $S$ and in the same coset modulo $(a)$ then using the
linear independence of the $\omega_i$ we see $\gamma_i - \gamma_i'$ is divisible by $a$ in $\mathbb{Z}$. Since
$0 \le \gamma_i, \gamma_i' < a$ it follows that $\gamma_i = \gamma_i'$. Thus $S$ is a set of coset representatives and
$D/(a)$ has $a^n$ elements as claimed. □

Corollary 1. $D$ is a Noetherian ring, i.e., every ascending chain of ideals
$A_1 \subset A_2 \subset A_3 \subset \cdots$ terminates. In other words, there is an $N > 0$ such that
$A_m = A_{m+1}$ for all $m \ge N$.

Proof. Since $D/A_1$ is finite there are only finitely many ideals containing
$A_1$. □

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Corollary 2 . Every prime ideal of D is maximal 

Proof. If P is a prime ideal then D/P is a finite integral domain. Such a ring 
is necessarily a field (see Exercise 19). Thus D/P is a field and so P is maximal. 

□ 
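
The counting argument in the proof of Proposition 12.2.3 can be checked by brute force in a small case. The sketch below is only an illustration under stated assumptions: it takes $F = \mathbb{Q}(i)$, so that $D = \mathbb{Z}[i]$ with integral basis $1, i$ and $n = 2$, and the helper names `coset_reps`, `divides_gaussian`, and `count_classes` are hypothetical. It lists the set S for $a = 5$ and then counts the cosets of the ideal $A = (2 + i)$, which contains 5, among those representatives.

```python
from itertools import product

def coset_reps(a, n):
    """Coordinate vectors (g_1, ..., g_n) with 0 <= g_i < a: the set S of the proof."""
    return list(product(range(a), repeat=n))

def divides_gaussian(w, z):
    """True if w divides z in Z[i]; a pair (x, y) stands for x + y*i."""
    (c, d), (x, y) = w, z
    norm = c * c + d * d
    # z * conj(w) = (x*c + y*d) + (y*c - x*d)*i must be divisible by N(w)
    return (x * c + y * d) % norm == 0 and (y * c - x * d) % norm == 0

def count_classes(reps, w):
    """Number of distinct cosets of the principal ideal (w) met by reps."""
    classes = []
    for z in reps:
        if not any(divides_gaussian(w, (z[0] - r[0], z[1] - r[1])) for r in classes):
            classes.append(z)
    return len(classes)

S = coset_reps(5, 2)                     # representatives of Z[i]/(5)
size_mod_5 = len(S)                      # a^n = 5^2
size_mod_A = count_classes(S, (2, 1))    # A = (2 + i) contains 5
```

Since every coset of A meets S, the second count is the size of D/A; it is finite because it is bounded by the first.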

The ring D is also integrally closed. This means that if $\alpha \in F$ satisfies a
monic polynomial with coefficients in D then $\alpha \in D$. This is not too hard to
establish using Proposition 6.1.4. In standard algebra texts it is shown that
if an integral domain is Noetherian, integrally closed, and every nonzero 
prime ideal is maximal then every ideal is a product of prime ideals in a 
unique way, i.e., such a ring is a Dedekind domain. We will establish the fact 
that D is a Dedekind domain in a different way using a very important 
property of number fields, namely that the class number of D is finite (see 
below). 

Our initial goal is to prove the following two results: 

(i) If A, B, and C are ideals and AB = AC, then B = C.

(ii) If A and B are ideals and $A \subset B$, then there is an ideal C such that A = BC.

These will be proved later. We begin by establishing a special case of (i). 

Lemma 3. Let $A \subset D$ be an ideal. If $\beta \in F$ is such that $\beta A \subset A$ then $\beta \in D$.

Proof. By Proposition 12.2.2 A is a finitely generated Z module so the result
follows from Proposition 6.1.4. □

Lemma 4. If A and B are ideals in D and A = AB then B = D.

Proof. Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be an integral basis for A. Since A = AB we can
find elements $b_{ij} \in B$ such that $\alpha_i = \sum_j b_{ij}\alpha_j$. It follows that the determinant
of the matrix $(b_{ij} - \delta_{ij})$ is zero. Writing this out shows $1 \in B$, i.e., B = D. □

Proposition 12.2.4. Let $A, B \subset D$ be ideals and suppose $\omega \in D$ is such that
$(\omega)A = BA$. Then $(\omega) = B$.

Proof. If $\beta \in B$ we see $(\beta/\omega)A \subset A$ so by Lemma 3, $\beta/\omega \in D$. It follows that
$B \subset (\omega)$ and so $\omega^{-1}B \subset D$ is an ideal. Since $A = \omega^{-1}BA$, Lemma 4 shows
$\omega^{-1}B = D$ and so $B = (\omega)$ as required. □

The following definition plays a major role in algebraic number theory. 

Definition. Two ideals $A, B \subset D$ are said to be equivalent, $A \sim B$, if there
exist nonzero $\alpha, \beta \in D$ such that $(\alpha)A = (\beta)B$. This is an equivalence relation.
The equivalence classes are called ideal classes. The number of ideal classes,
$h_F$, is called the class number of F. (We will see that $h_F$ is finite.)

We leave the easy verification that $A \sim B$ is an equivalence relation to
the reader.

It is worthwhile to point out that $h_F = 1$ if and only if D is a principal
ideal domain (PID). To see this suppose $h_F = 1$ and let A be an ideal. Since
$A \sim D$ there are nonzero $\alpha, \beta \in D$ such that $(\alpha)A = (\beta)D = (\beta)$. Thus $\beta/\alpha \in A$
and $A = (\beta/\alpha)$. Every ideal is principal. On the other hand it is obvious that
if D is a PID then $h_F = 1$.

Thus we see that the class number measures, in some sense, how far D is 
from being a PID (see Exercises 15, 16 and Masley [184]). 

The following lemma is due to A. Hurwitz [154], p. 237. We will use it to 
show h F is finite. It is to be noticed that the lemma is a (weak) generalization 
of the Euclidean algorithm to an arbitrary number field. 


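Lemma 5. There exists a positive integer M depending only on F with the
following property. Given $\alpha, \beta \in D$, $\beta \ne 0$, there is an integer t, $1 \le t \le M$,
and an element $\omega \in D$ such that $|N(t\alpha - \omega\beta)| < |N(\beta)|$.

Proof. We first reformulate the statement slightly. Let $\gamma = \alpha/\beta \in F$. Then it is
sufficient to show that there is an M such that for all $\gamma \in F$ we have $|N(t\gamma - \omega)| < 1$
for some $1 \le t \le M$ and $\omega \in D$.

Let $\omega_1, \omega_2, \ldots, \omega_n$ be an integral basis for D. For $\gamma \in F$, $\gamma = \sum_{i=1}^{n} \gamma_i\omega_i$
with $\gamma_i \in \mathbb{Q}$. Notice that

$$|N(\gamma)| = \Bigl|\prod_{j=1}^{n} \Bigl(\sum_{i=1}^{n} \gamma_i\omega_i^{(j)}\Bigr)\Bigr| \le C \bigl(\max_i |\gamma_i|\bigr)^n,$$

where $C = \prod_{j=1}^{n} \bigl(\sum_{i=1}^{n} |\omega_i^{(j)}|\bigr)$ and $\omega_i^{(1)}, \ldots, \omega_i^{(n)}$ denote the conjugates of $\omega_i$.
Choose an integer $m > \sqrt[n]{C}$ and set $M = m^n$.

For $\gamma \in F$, $\gamma = \sum_{i=1}^{n} \gamma_i\omega_i$, write $\gamma_i = a_i + b_i$ where $a_i \in \mathbb{Z}$ and $0 \le b_i < 1$.
Let $[\gamma] = \sum_{i=1}^{n} a_i\omega_i$ and $\{\gamma\} = \sum_{i=1}^{n} b_i\omega_i$. Then $\gamma = [\gamma] + \{\gamma\}$ where
$[\gamma] \in D$ and $\{\gamma\}$ has coordinates between 0 and 1.

Map F to Euclidean n-space $\mathbb{R}^n$ by $\phi(\sum_{i=1}^{n} \gamma_i\omega_i) = (\gamma_1, \gamma_2, \ldots, \gamma_n)$. For
any $\gamma \in F$, $\phi(\{\gamma\})$ lies in the unit cube. Partition the unit cube into $m^n$ subcubes
of side 1/m. Consider the points $\phi(\{k\gamma\})$ for $1 \le k \le m^n + 1$. By the pigeonhole
principle two of them, at least, must lie in the same subcube, say those
corresponding to $h\gamma$ and $l\gamma$. If we write $h\gamma = [h\gamma] + \{h\gamma\}$ and $l\gamma = [l\gamma] + \{l\gamma\}$
and subtract we find $t\gamma = \omega + \delta$ where (assuming $h > l$) $t = h - l \le m^n
= M$, $\omega \in D$, and the coordinates of $\delta$ have absolute value less than or equal
to 1/m.

By our previous remark, $|N(\delta)| \le C(1/m)^n = C/m^n < 1$. □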

Theorem 1. The class number of F is finite. 

Proof. Let A be an ideal in D. For $\alpha \in A$, $\alpha \ne 0$, $|N(\alpha)|$ is a positive integer.
Choose $\beta \in A$, $\beta \ne 0$, so that $|N(\beta)|$ is minimal. For any $\alpha \in A$ there is a t,
$1 \le t \le M$, such that $|N(t\alpha - \omega\beta)| < |N(\beta)|$ with $\omega \in D$. Since $t\alpha - \omega\beta \in A$
we must have $t\alpha - \omega\beta = 0$. It follows that $M!\,A \subset (\beta)$. Let $B = (1/\beta)M!\,A
\subset D$. B is an ideal and $M!\,A = (\beta)B$. Since $\beta \in A$, $M!\,\beta \in (\beta)B$
and so $M! \in B$. By Proposition 12.2.3 $M!$ can be contained in at most finitely
many ideals. We have shown $A \sim B$ where B is one of at most finitely many
ideals. Thus $h_F$ is finite, as asserted. □
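
Lemma 5 can be tried out numerically. The following is a minimal sketch under assumptions that are not part of the text: it fixes $F = \mathbb{Q}(\sqrt{-5})$ with integral basis $1, \sqrt{-5}$, for which $C = (1 + \sqrt{5})^2 \approx 10.47$, so $m = 4$ and $M = 16$; the function names `norm` and `hurwitz_step` are hypothetical. For a given $\gamma = r + s\sqrt{-5}$ it searches for the smallest t with $|N(t\gamma - \omega)| < 1$, taking $\omega$ to be the nearest lattice point.

```python
from fractions import Fraction

# Assumed setting: F = Q(sqrt(-5)), integral basis 1, sqrt(-5), n = 2.
# C = (1 + sqrt(5))^2 ~ 10.47, hence m = 4 > sqrt(C) and M = m^n = 16.
M = 16

def norm(x, y):
    """Norm of x + y*sqrt(-5) for rational coordinates x, y."""
    return x * x + 5 * y * y

def hurwitz_step(r, s):
    """Return (t, a, b) with 1 <= t <= M and N(t*gamma - (a + b*sqrt(-5))) < 1,
    where gamma = r + s*sqrt(-5)."""
    for t in range(1, M + 1):
        a, b = round(t * r), round(t * s)   # nearest lattice point to t*gamma
        if norm(t * r - a, t * s - b) < 1:
            return t, a, b
    raise AssertionError("Lemma 5 guarantees this is never reached")

example = hurwitz_step(Fraction(1, 2), Fraction(1, 2))
```

For $\gamma = (1 + \sqrt{-5})/2$ the choice $t = 1$ fails, since the nearest lattice point leaves a remainder of norm 3/2, which is why the multiplier t is needed at all; this is the sense in which the lemma is a weak form of the Euclidean algorithm.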

An interesting and significant application of this theorem is the following 
proposition. 

Proposition 12.2.5. For any ideal $A \subset D$ there is an integer k, $1 \le k \le h_F$,
such that $A^k$ is principal.

Proof. Consider the set of ideals $\{A^i \mid 1 \le i \le h_F + 1\}$. At least two of these
ideals must lie in the same class, say $A^i \sim A^j$ with $i < j$. There exist $\alpha, \beta \in D$
such that $(\alpha)A^i = (\beta)A^j$. Let $k = j - i$ and $B = A^k$. We will show that B is
principal.

Since, clearly, $(\alpha)A^i = (\beta)BA^i$ we see $(\alpha/\beta)A^i \subset A^i$ so $\alpha/\beta \in D$. Let $\omega = \alpha/\beta$.
Then $(\omega)A^i = BA^i$. By Proposition 12.2.4, $(\omega) = B$. □

We remark that the set of ideal classes can be made into a group. Let $\bar{A}$
denote the class of A. We define the product of $\bar{A}$ and $\bar{B}$ to be $\overline{AB}$. One can
check without trouble that this is well defined, i.e., if $\bar{A} = \bar{A}_1$ and $\bar{B} = \bar{B}_1$
then $\overline{AB} = \overline{A_1B_1}$. Associativity follows from the fact that ideal multiplication
is associative. The class of D serves as an identity element. Finally, the last
proposition shows that an inverse to $\bar{A}$ is the class $\overline{A^{k-1}}$. The structure of the
class group has been a major research problem ever since the concept was
invented.

One consequence of the fact that the ideal classes form a group is that
$A^{h_F}$ is principal for all ideals A. This will not be needed in the remainder of
this chapter.

We can now give proofs for the two results mentioned earlier (before 
Lemma 3). 

Proposition 12.2.6. If A, B, and C are ideals, and AB = AC, then B = C.

Proof. By the last proposition, there is a $k > 0$ such that $A^k = (\alpha)$. Multiply
AB = AC on both sides by $A^{k-1}$. We find $(\alpha)B = (\alpha)C$. It follows that
B = C. □

Proposition 12.2.7. If A and B are ideals such that $B \supset A$, then there is an
ideal C such that A = BC.

Proof. As above there is a $k > 0$ such that $B^k = (\beta)$.

Now, since $A \subset B$ we have $B^{k-1}A \subset B^k = (\beta)$ so $C = (1/\beta)B^{k-1}A \subset D$
is an ideal.

Thus, $BC = (1/\beta)B^kA = (1/\beta)(\beta)A = A$. □

This proposition can be phrased “to contain is to divide.”
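
A worked instance of Propositions 12.2.5 and 12.2.7 may help. It is an illustration under an assumption not taken from the text: the field is $F = \mathbb{Q}(\sqrt{-5})$ with $D = \mathbb{Z}[\sqrt{-5}]$, and the ideal $B = P = (2, 1 + \sqrt{-5})$ contains $A = (2)$. Following the proof with $k = 2$ and $\beta = 2$:

```latex
\begin{align*}
P^2 &= (4,\; 2 + 2\sqrt{-5},\; -4 + 2\sqrt{-5})
     = (2)\,(2,\; 1 + \sqrt{-5},\; -2 + \sqrt{-5}), \\
3 &= (1 + \sqrt{-5}) - (-2 + \sqrt{-5}), \qquad 1 = 3 - 2
   \;\Longrightarrow\; (2,\; 1 + \sqrt{-5},\; -2 + \sqrt{-5}) = D, \\
P^2 &= (2), \\
C &= \tfrac{1}{\beta}\, P^{k-1} A = \tfrac{1}{2}\, P\,(2) = P,
   \qquad PC = P^2 = (2) = A .
\end{align*}
```

So the ideal P, which contains (2), divides (2), and the cofactor C produced by the proof is P itself.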

We now have all the tools we need to establish unique factorization into
prime ideals.

Proposition 12.2.8. Every ideal in D can be written as a product of prime ideals.

Proof. Let A be a proper ideal. Since D/A is finite, A is contained in a maximal
ideal $P_1$ (using Zorn’s lemma one can show that in an arbitrary commutative
ring with identity a proper ideal is contained in a maximal ideal). By the last
proposition $A = P_1B_1$ for some ideal $B_1$. If $B_1 \ne D$ then $B_1$ is contained in a
maximal ideal $P_2$ and so $A = P_1P_2B_2$. If $B_2 \ne D$ we can continue the
process. Notice that $A \subset B_1 \subset B_2 \subset \cdots$ is a proper ascending chain of ideals.
By Corollary 1 to Proposition 12.2.3 we see that in finitely many steps $B_t = D$.
Thus $A = P_1P_2 \cdots P_t$. □

Let P be a prime ideal. The descending chain $P \supset P^2 \supset P^3 \supset \cdots$ is proper
since if $P^i = P^{i+1}$ for some i then $PP^i = P^i$ and so P = D by Lemma 4.
This fact is the basis of the following definition.

Definition. Let P be a prime ideal and A an ideal. Then $\operatorname{ord}_P A$ is defined to
be the unique nonnegative integer t such that $P^t \supset A$ and $P^{t+1} \not\supset A$.

Proposition 12.2.9. Let P be a prime ideal and A and B ideals. Then

(i) $\operatorname{ord}_P P = 1$.

(ii) If $P' \ne P$ is prime, $\operatorname{ord}_P P' = 0$.

(iii) $\operatorname{ord}_P AB = \operatorname{ord}_P A + \operatorname{ord}_P B$.

Proof. The first assertion is clear. As for (ii) assume $\operatorname{ord}_P P' > 0$. Then
$P \supset P'$. Since prime ideals are maximal $P = P'$, contradicting the assumption.

Let $t = \operatorname{ord}_P A$ and $s = \operatorname{ord}_P B$. By Proposition 12.2.7 we have $A = P^tA_1$
and $B = P^sB_1$. By the same proposition we must have $P \not\supset A_1$ and $P \not\supset B_1$.

Now, $AB = P^{s+t}A_1B_1$. If $P^{s+t+1} \supset AB$ then $AB = P^{s+t+1}C$ and so by
Proposition 12.2.6, $PC = A_1B_1$. This implies $P \supset A_1B_1$ and since P is prime
that $P \supset A_1$ or $P \supset B_1$. This is a contradiction.

Thus $\operatorname{ord}_P AB = t + s = \operatorname{ord}_P A + \operatorname{ord}_P B$. □

Theorem 2. Let $A \subset D$ be an ideal. Then $A = \prod P^{a(P)}$ where the product is
over the distinct prime ideals of D, and the a(P) are nonnegative integers all
but finitely many of which are zero. Finally, the integers a(P) are uniquely
determined by $a(P) = \operatorname{ord}_P A$.

Proof. The product representation follows from Proposition 12.2.8.

Let $P_0$ be a prime ideal and apply $\operatorname{ord}_{P_0}$ to both sides of the product given
in the theorem. Using Proposition 12.2.9 we see

$$\operatorname{ord}_{P_0} A = \sum_P a(P) \operatorname{ord}_{P_0}(P) = a(P_0).$$ □


§3 Ramification and Degree


Let P be a prime ideal of D. By Lemma 2, $P \cap \mathbb{Z}$ is not zero. Since it is clearly
a prime ideal of Z it must be generated by a prime number p.

Definition. The number $e = \operatorname{ord}_P(p)$ is called the ramification index of P
(here (p) is the principal ideal generated by p in D).

D/P is a finite field containing Z/pZ. Thus the number of elements in D/P
is of the form $p^f$ for some $f \ge 1$. The number f is called the degree of P.

Let $p \in \mathbb{Z}$ be a prime number and let $P_1, P_2, \ldots, P_g$ be the primes in D
containing (p). Let $e_i$ and $f_i$ be the ramification index and degree of $P_i$. By
Theorem 2, $(p) = P_1^{e_1}P_2^{e_2} \cdots P_g^{e_g}$.

There exists a remarkable relation among the numbers $e_i$, $f_i$, and n.

Theorem 3. $\sum_{i=1}^{g} e_if_i = n$.

We postpone the proof until we have developed some necessary background.

Proposition 12.3.1. Let R be a commutative ring with identity. Suppose $A_1,
A_2, \ldots, A_g$ are ideals such that $A_i + A_j = R$ for $i \ne j$. Let $A = A_1A_2 \cdots A_g$.
Then

$$R/A \cong R/A_1 \oplus R/A_2 \oplus \cdots \oplus R/A_g.$$

Proof. Let $\psi_i$ be the natural map from R to $R/A_i$ and define
$\psi: R \to R/A_1 \oplus \cdots \oplus R/A_g$ by $\psi(\gamma) = (\psi_1(\gamma), \psi_2(\gamma), \ldots, \psi_g(\gamma))$. We will show $\psi$ is
onto and the kernel is A.

To show $\psi$ is onto, it is sufficient to show that for any $\gamma_1, \gamma_2, \ldots, \gamma_g \in R$
the set of simultaneous congruences $x \equiv \gamma_i \ (A_i)$, $i = 1, \ldots, g$ is solvable.

Expanding the product $(A_1 + A_2)(A_1 + A_3) \cdots (A_1 + A_g) = R$ we see
that all the summands, except the last, are in $A_1$. Thus $A_1 + A_2A_3 \cdots A_g = R$.
There exist elements $v_1 \in A_1$ and $u_1 \in A_2 \cdots A_g$ such that $v_1 + u_1 = 1$. Then
$u_1 \equiv 1 \ (A_1)$ and $u_1 \equiv 0 \ (A_i)$ for $i \ne 1$. Similarly, for each j there is a $u_j$
such that $u_j \equiv 1 \ (A_j)$ and $u_j \equiv 0 \ (A_i)$ for $i \ne j$. It is then clear that
$x = \gamma_1u_1 + \gamma_2u_2 + \cdots + \gamma_gu_g$ is a solution to our set of congruences.

Having shown that $\psi$ is onto, we now investigate the kernel. Clearly,
$\ker \psi = A_1 \cap A_2 \cap \cdots \cap A_g$. We must show that under the hypotheses the
intersection is equal to the product. This can be done by induction on g.
Suppose g = 2. Then, since $A_1 + A_2 = R$, there exist $a_1 \in A_1$ and $a_2 \in A_2$
such that $a_1 + a_2 = 1$. If $a \in A_1 \cap A_2$ then $a = aa_1 + aa_2 \in A_1A_2$. This shows
$A_1 \cap A_2 \subset A_1A_2$. The reverse inclusion is obvious so the result follows for
g = 2. Now suppose g > 2 and we know the result for g − 1. Then $A_1 \cap
A_2 \cap \cdots \cap A_g = A_1 \cap A_2A_3 \cdots A_g$. However, $A_1 + A_2A_3 \cdots A_g = R$ by the
first part of the proof. Thus, $A_1 \cap A_2A_3 \cdots A_g = A_1A_2 \cdots A_g$ and the proof
is complete. □
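
The surjectivity argument is constructive, and for $R = \mathbb{Z}$ it can be sketched in a few lines. The code below is an illustration only, assuming pairwise coprime integer moduli $n_j$ in the role of the ideals $A_j = (n_j)$; the function name `crt` is hypothetical. It builds the elements $u_j$ of the proof and forms $x = \sum \gamma_ju_j$.

```python
from math import prod

def crt(residues, moduli):
    """Solve x = residues[j] (mod moduli[j]) for pairwise coprime moduli.

    Mirrors the proof: u_j is 1 modulo moduli[j] and 0 modulo every other
    modulus, and x is the sum of residues[j] * u_j.
    """
    total = prod(moduli)
    x = 0
    for gamma, n_j in zip(residues, moduli):
        rest = total // n_j                 # generator of the product of the other ideals
        u_j = rest * pow(rest, -1, n_j)     # u_j = 1 (n_j), u_j = 0 (n_i) for i != j
        x += gamma * u_j
    return x % total

solution = crt([2, 3, 2], [3, 5, 7])
```

The result is unique modulo the product of the moduli, which is the statement that the kernel of $\psi$ is A.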

This proposition is called the Chinese Remainder Theorem for rings. We 
return from a general commutative ring R to D. 

Proposition 12.3.2. Let $P \subset D$ be a prime ideal and let $p^f$ be the number of
elements in D/P. The number of elements in $D/P^e$ is $p^{ef}$.

Proof. The assertion is true for e = 1. If e > 1 then $D/P^e$ has $P^{e-1}/P^e$ as a
subgroup and the quotient is isomorphic to $D/P^{e-1}$ (second law of isomorphism).
If we can show $P^{e-1}/P^e$ has $p^f$ elements then the result will follow
by induction.

Since $P^e \subset P^{e-1}$ properly we can find an $\alpha \in P^{e-1}$ such that $\alpha \notin P^e$. We
claim $(\alpha) + P^e = P^{e-1}$. Since $P^e \subset (\alpha) + P^e$ the latter ideal must be a power
of P. Since $(\alpha) + P^e \subset P^{e-1}$ and $\alpha \notin P^e$ we must have $(\alpha) + P^e = P^{e-1}$.

Map D to $P^{e-1}/P^e$ by $\gamma \to \gamma\alpha + P^e$. This is easily seen to be a homomorphism
onto. An element $\gamma$ is in the kernel if and only if $\gamma\alpha \in P^e$, i.e., iff
$\operatorname{ord}_P(\gamma\alpha) \ge e$. Now, $\operatorname{ord}_P(\gamma\alpha) = \operatorname{ord}_P(\gamma) + \operatorname{ord}_P \alpha = \operatorname{ord}_P(\gamma) + e - 1$. Thus
$\gamma$ is in the kernel iff $\operatorname{ord}_P(\gamma) \ge 1$ which is equivalent to saying $\gamma \in P$. Thus
$D/P \cong P^{e-1}/P^e$ and so the latter group has $p^f$ elements. □
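
Proposition 12.3.2 can be confirmed by direct counting in a small case. The sketch below assumes, purely for illustration, $D = \mathbb{Z}[i]$ and the prime ideal $P = (1 + i)$ lying over $p = 2$ with $f = 1$; the helper names `mul`, `divides`, and `quotient_size` are hypothetical. It counts the cosets of $P^e$ for $e = 1, \ldots, 4$.

```python
from itertools import product

def mul(z, w):
    """Product of Gaussian integers given as pairs (x, y) = x + y*i."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def divides(w, z):
    """True if w divides z in Z[i] (z * conj(w) is divisible by N(w))."""
    n = w[0] ** 2 + w[1] ** 2
    return (z[0] * w[0] + z[1] * w[1]) % n == 0 and (z[1] * w[0] - z[0] * w[1]) % n == 0

def quotient_size(w):
    """Number of elements of Z[i]/(w), counted by brute force."""
    n = w[0] ** 2 + w[1] ** 2        # N(w) lies in (w), so coordinates 0..n-1 suffice
    classes = []
    for z in product(range(n), repeat=2):
        if not any(divides(w, (z[0] - r[0], z[1] - r[1])) for r in classes):
            classes.append(z)
    return len(classes)

pi = (1, 1)                          # generator of P = (1 + i); D/P has 2 elements
power, sizes = (1, 0), []
for e in range(1, 5):
    power = mul(power, pi)           # generator of P^e
    sizes.append(quotient_size(power))
```

Each step from $P^{e-1}$ to $P^e$ multiplies the count by $|D/P| = 2$, as the induction in the proof predicts.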

We can now prove Theorem 3. Remember $(p) = P_1^{e_1}P_2^{e_2} \cdots P_g^{e_g}$. It is
not hard to see that $P_i^{e_i} + P_j^{e_j} = D$ for $i \ne j$ (see Exercise 25). By Proposition
12.3.1

$$D/(p) \cong D/P_1^{e_1} \oplus D/P_2^{e_2} \oplus \cdots \oplus D/P_g^{e_g}.$$

The proof of Proposition 12.2.3 shows $|D/(p)| = p^n$. On the other hand
Proposition 12.3.2 shows $D/P_i^{e_i}$ has $p^{e_if_i}$ elements. Thus

$$p^n = p^{e_1f_1}p^{e_2f_2} \cdots p^{e_gf_g}.$$

It follows that $n = e_1f_1 + e_2f_2 + \cdots + e_gf_g$ as asserted. □

When F/Q is Galois, that is, when all the isomorphisms of F into C
are actually automorphisms, Theorem 3 can be strengthened. Suppose
F/Q is Galois and let G be the Galois group. If A is an ideal and $\sigma \in G$ let
$\sigma A = \{\sigma\alpha \mid \alpha \in A\}$. One easily checks that $\sigma A$ is again an ideal. Also, $\sigma D = D$.
Thus $D/\sigma A = \sigma D/\sigma A \cong D/A$. In particular this shows that if P is a prime
ideal, then $\sigma P$ is also a prime ideal.

Proposition 12.3.3. Let $p \in \mathbb{Z}$ be a prime number. Suppose $P_i$ and $P_j$ are prime
ideals of D containing p. Then there is a $\sigma \in G$ such that $\sigma P_i = P_j$.





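Proof. Suppose there is a prime ideal $P_0$ containing p and not in the set
$\{\sigma P_i \mid \sigma \in G\}$. By Proposition 12.3.1 we can find an $\alpha \in D$ such that $\alpha \equiv 0 \ (P_0)$
and $\alpha \equiv 1 \ (\sigma P_i)$ for all $\sigma \in G$.

Then $N(\alpha) = \prod_{\sigma \in G} \sigma\alpha \in P_0 \cap \mathbb{Z} = p\mathbb{Z}$. It follows that $N(\alpha) \in P_i$ and so
$\sigma\alpha \in P_i$ for some $\sigma$ since $P_i$ is prime. But then $\alpha \in \sigma^{-1}P_i$, contradicting
$\alpha \equiv 1 \ (\sigma^{-1}P_i)$. □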

Theorem 3'. Suppose F/Q is a Galois extension. Let $p \in \mathbb{Z}$ be a prime number
and write $(p) = P_1^{e_1}P_2^{e_2} \cdots P_g^{e_g}$. Then $e_1 = e_2 = \cdots = e_g$ and $f_1 = f_2 = \cdots = f_g$. If
e and f denote these common values, then efg = n.

Proof. For a given index i there is a $\sigma \in G$ such that $\sigma P_1 = P_i$. Since $D/P_1 \cong
D/\sigma P_1 = D/P_i$ we find $f_1 = f_i$. Thus all the $f_i$'s are equal.

Apply $\sigma$ to both sides of $(p) = P_1^{e_1}P_2^{e_2} \cdots P_g^{e_g}$. Since $p \in \mathbb{Z}$ it is clear that
$\sigma(p) = (p)$. Thus

$$(p) = (\sigma P_1)^{e_1}(\sigma P_2)^{e_2} \cdots (\sigma P_g)^{e_g}.$$

In this product we see the exponent of $P_i = \sigma P_1$ is $e_1$. In the first expression
the exponent of $P_i$ is $e_i$. By uniqueness of prime factorization we must have
$e_1 = e_i$ and so all the $e_i$'s are equal.

Finally, since $\sum e_if_i = n$ we see immediately that efg = n. □

We conclude this section by discussing, without proofs, some important 
facts about number fields. In our applications we will be able to do without 
this general theory. 

Let $P \subset D$ be a prime ideal with ramification index e. Let $P \cap \mathbb{Z} = p\mathbb{Z}$.
We say that P is a ramified prime if e > 1. One can show that P is ramified
only if p divides $\delta_F = \Delta(D)$, the discriminant of F. In particular, only finitely
many primes are ramified. If $p \nmid \delta_F$ then (p) is a product of distinct prime
ideals in D. An important result of Minkowski asserts that if [F : Q] > 1
then $|\delta_F| > 1$. In fact Minkowski found a more precise result, namely an
explicit lower bound for $|\delta_F|$. An important consequence is that every
number field strictly bigger than Q contains ramified primes.

Now suppose F/Q is a Galois extension with group G. Associate with a
prime ideal P the group $G(P) = \{\sigma \in G \mid \sigma P = P\}$. G(P) is called the
decomposition group of P. D/P is a finite field containing Z/pZ. The field
D/P is a Galois extension of Z/pZ. Call the Galois group $\bar{G}$. There is a
homomorphism from G(P) to $\bar{G}$ given as follows. If $\sigma \in G(P)$ and $\bar{\alpha}$ denotes the
residue class of $\alpha \in D$ in D/P define $\bar{\sigma}$ by the equation $\bar{\sigma}(\bar{\alpha}) = \overline{\sigma\alpha}$. This is well
defined, $\bar{\sigma} \in \bar{G}$, and $\sigma \to \bar{\sigma}$ is a homomorphism. One can show this
homomorphism is onto (Exercise 26). Let T(P) be the kernel. T(P) is called the
inertia group of P. We have

$$G(P)/T(P) \cong \bar{G}.$$

It is not hard to see that $|\bar{G}| = f$ and $|G(P)| = n/g = ef$. It follows that
$|T(P)| = e$. Thus, if P is unramified $G(P) \cong \bar{G}$.

From the theory of finite fields $\bar{G}$ is a cyclic group generated by the
automorphism $\phi_p$ which takes $\bar{\alpha}$ to $\bar{\alpha}^p$. If P is unramified there is a unique
$\sigma_P \in G(P)$ such that $\bar{\sigma}_P = \phi_p$. This automorphism $\sigma_P$ is called the Frobenius
automorphism associated to P. Notice that the order of $\sigma_P$ is equal to the
order of $\phi_p$ which is f, the degree of P. As it turns out, a large part of the
arithmetic theory of algebraic number fields centers around the properties
of the Frobenius automorphism. We will see illustrations of this in the next
chapter.

Notes 

The fact that the ring of integers in an algebraic number field forms a Dedekind
ring is due to R. Dedekind and appears in the eleventh supplement to
Dirichlet’s Vorlesungen über Zahlentheorie [127]. This result was subsequently
also proven by Kronecker, Hilbert, and Hurwitz. The inertia and
decomposition groups were introduced by Hilbert (1894) in his “Grundzüge
einer Theorie des Galoisschen Zahlkörpers” (see also §39 of Hilbert’s
“Zahlbericht” [151] and Dedekind [121], Vol. 2, pp. 43-49).

It can be shown more generally that if D is a Dedekind ring with field of
fractions k and K is a finite separable extension of k the integral closure of
D in K (Exercise 27) is a Dedekind ring. This follows from a theorem of E.
Noether characterizing Dedekind rings as Noetherian domains which are 
integrally closed and in which every nonzero prime ideal is maximal. For this 
approach see Samuel-Zariski [214]. In our approach, as in other classical 
approaches, essential use is made of the fact that the residue class ring modulo 
a nonzero ideal is finite. The idea of deriving the Dedekind property from the 
finiteness of the class number is due to Hurwitz. It will be noticed that in our 
approach no use is made of the fact that the number of elements in the residue 
class ring is a multiplicative function of the ideal. Butts and Wade [103] have 
shown that the multiplicativity of this map implies the Dedekind property. 
The usual classical approach is to show by a suitable generalization of Gauss’ 
lemma (Exercise 4, Chapter 6) that the ideal classes form a group. 

Recently the characterization of fields F with class number 2 due to 
Carlitz (see Exercises 15 and 16) has been generalized by A. Czogala [117]. 
He proves, among other things, that a number field has an ideal class group 
which is cyclic of order 2, cyclic of order 3, or the Klein four group iff the 
product of two irreducibles may be rewritten as the product of at most three 
other irreducibles. 

A deep result conjectured by Hilbert and proved by Furtwängler asserts
the existence, for each number field F, of an extension E satisfying the following
conditions. First of all the degree of E over F is equal to the class number of
F. Every prime ideal $\mathfrak{P}$ of F decomposes into the product of $h_F/f$ distinct
prime ideals in E where f is the order of the ideal class of $\mathfrak{P}$ in the class group.
Every ideal of F becomes principal in E. Finally the ideal class group of F
is isomorphic to the Galois group of E over F. The field E is unique and is
called the Hilbert class field of F. The existence of the Hilbert class field is a
valuable tool in studying the structure of the ideal class group.

The actual calculation of the class number is a difficult matter. Even for 
quadratic number fields of small discriminant the calculation requires 
estimates (due to Minkowski) which we have omitted. These matters are 
discussed in most standard texts on algebraic number theory. We recommend 
the treatment in D. Marcus [183]. This book contains a large number of 
interesting exercises. 

In more recent texts it is customary to describe the ideal class group in
terms of fractional ideals. If D is an integral domain with field of fractions F, a
fractional ideal A is a D submodule of F for which there exists an element d
in D with $dA \subset D$. Fractional ideals can be multiplied in the obvious way.
It can be shown that D is a Dedekind ring iff the (nonzero) fractional ideals
form a group [214]. The subgroup of fractional ideals of the form fD with
f in F are the principal fractional ideals. It is not difficult to show that the
ideal class group of an algebraic number field is isomorphic to the quotient
group of the group of fractional ideals by the subgroup of principal fractional
ideals.


Exercises 


1. Find the minimal polynomial for $\sqrt{3} + \sqrt{7}$.

2. Compute the discriminant of $\mathbb{Q}(\sqrt{2} + \sqrt{5})$.

3. Describe the units in $\mathbb{Q}(\sqrt{5})$.

4. Let D be the ring of integers in $\mathbb{Q}(\sqrt{d})$. Show that, given N > 0, there are at most
finitely many integers $\alpha \in D$ with $\max(|\alpha|, |\alpha'|) < N$, where $\alpha'$ is the conjugate of $\alpha$.

5. Generalize Exercise 4 to an arbitrary number field. 

6. If D is the ring of integers in an algebraic number field and $\mathfrak{P}$ is a prime ideal such
that $\mathfrak{P} = (\alpha)$ then show that $\alpha$ is irreducible.

7. Show that the class number of $\mathbb{Q}(\sqrt{-5})$ is greater than one.

8. Let F be a number field. Show that the discriminant $\delta_F$ is congruent to 0 or 1 modulo
4. This is one of Stickelberger’s theorems. The proof is tricky (cf. [207], p. 97).

9. Compute the discriminant $\Delta(1, \alpha, \alpha^2)$, relative to $\mathbb{Q}(\alpha)$, where $\alpha$ is a root of the
irreducible cubic $x^3 + px + q$, $p, q \in \mathbb{Q}$.

10. If $R \subset S$ are integral domains, $\alpha \in S$ is said to be integral over R if $\alpha^m + b_1\alpha^{m-1} + \cdots
+ b_m = 0$ for suitable m; $b_1, \ldots, b_m \in R$. S is called integral over R if every element of
S is integral over R. Prove that if S is integral over R then S is a field iff R is a field.

11. Let $\alpha_1, \ldots, \alpha_n \in D$, the ring of integers in a number field F, $\Delta(\alpha_1, \ldots, \alpha_n) \ne 0$. Show
that if $\Delta(\alpha_1, \ldots, \alpha_n)$ is a product of distinct primes (i.e., $\Delta$ is square free) then $\alpha_1, \ldots, \alpha_n$
is an integral basis. Conclude that if d is square free, $d \equiv 1 \ (4)$, then $(1 + \sqrt{d})/2$, 1
form an integral basis for the ring of integers in $\mathbb{Q}(\sqrt{d})$.

12. Show that $\sin(\pi/12)$ is an algebraic number.

13. Show that $(3, 1 + \sqrt{-5})$ is a proper ideal in $\mathbb{Z}[\sqrt{-5}]$. Is it prime?

14. Construct an irreducible cubic polynomial over Q with only real roots.


15. Let F be an algebraic number field, D its ring of integers. Suppose the class number
of F is 2. Show that if $\pi$ is an irreducible such that $(\pi)$ is not prime then $(\pi) = \mathfrak{P}_1\mathfrak{P}_2$
where $\mathfrak{P}_1, \mathfrak{P}_2$ are (not necessarily distinct) prime ideals.

16. (L. Carlitz) Let F, D be as in Exercise 15. Show that if $\alpha \in D$, $\alpha = \pi_1 \cdots \pi_s =
\lambda_1 \cdots \lambda_t$ are two decompositions of $\alpha$ into the product of irreducibles then s = t.
[Note: The converse is also true! (cf. Carlitz [106]).]


17. Let f(x), g(x) be the respective minimal polynomials of $\alpha$ and $\beta$ of respective degrees
n and m. Let the roots in C of f(x) and g(x) respectively be $\alpha = \alpha_1, \alpha_2, \ldots, \alpha_n$ and
$\beta = \beta_1, \beta_2, \ldots, \beta_m$. Recall by Exercise 16, Chapter 6, there are no repeated roots.
Choose $t \in \mathbb{Q}$ so that $\alpha_i + t\beta_j \ne \alpha + t\beta$, $j \ne 1$, all i. Put $\gamma = \alpha + t\beta$. Show that

(a) $f(\gamma - tx)$, g(x) have greatest common divisor (in C[x]) $x - \beta$.

(b) (on the other hand) the greatest common divisor of $f(\gamma - tx)$ and g(x) is in
$\mathbb{Q}(\gamma)[x]$.

(c) $\beta \in \mathbb{Q}(\gamma)$, $\alpha \in \mathbb{Q}(\gamma)$.

18. (Theorem on the primitive element.) If F is an algebraic number field show that
there exists an element $\gamma \in F$ such that $\mathbb{Q}(\gamma) = F$.


19. Show that a finite integral domain is a field. 

20. Let $K = \mathbb{F}_2(x)$ and $L = K(\sqrt{x})$. Show that the trace map is identically zero. (Recall,
$\mathbb{F}_2$ is the finite field with two elements.)


21. Let F be an algebraic number field of degree n. If $\alpha \in F$, let T be the linear
transformation defined by $T(\gamma) = \alpha\gamma$. Show that $\det(xI - T) = f(x)^t$ where $t = n/\deg(f)$,
and f(x) is the minimal polynomial of $\alpha$.

22. Let $F \subset E$ be algebraic number fields. Show that any isomorphism of F into C
extends in exactly [E : F] ways to an isomorphism of E into C.

23. Let F be an algebraic number field of degree n and let $\sigma_1, \ldots, \sigma_n$ be the distinct
isomorphisms of F into C. Show that, for $\alpha \in F$, the notation being as in Exercise 21,

24. The notation being as in Exercise 23 show that

$$N_{F/\mathbb{Q}}(\alpha) = \prod_{i=1}^{n} \sigma_i(\alpha) \quad \text{and} \quad t_{F/\mathbb{Q}}(\alpha) = \sum_{i=1}^{n} \sigma_i(\alpha).$$

25. Let F be an algebraic number field with ring of integers D. Show that if P and Q are
distinct prime ideals then $(P^a, Q^b) = D$, where a and b are positive integers.

26. Let P be a prime ideal in the ring of integers D of an algebraic number field F.
If F/Q is Galois show that the natural map from the decomposition group of P to
the Galois group of the residue class field is onto.

27. If k is a field containing a ring D the set of all elements in k which are integral over D
(Exercise 10) is called the integral closure of D in k. Show that the integral closure
is a ring and that it is integrally closed.

28. Let D be the ring of integers in a number field F. Suppose $(p) = P^2A$ for p prime in
Z and a prime ideal P. Show

(a) There exists $\alpha \in PA$, $\alpha \notin P^2A$.

(b) $(\alpha\beta)^p \in pD$ all $\beta \in D$.

(c) $(\operatorname{tr}(\alpha\beta))^p \equiv \operatorname{tr}((\alpha\beta)^p) \ (pD)$.

(d) $p \mid \operatorname{tr}(\alpha\beta)$ all $\beta \in D$.

(e) $p \mid \Delta$, the discriminant of F.

(Be sure to use the fact that $\alpha \notin pD$.)


29. Let F be a Galois extension of Q with abelian Galois group. Show that if $p \in \mathbb{Q}$ is
unramified in F then $\sigma_P = \sigma_{P'}$, for prime ideals P and P' dividing p in F, where $\sigma_P$
denotes the Frobenius automorphism.

30. Let p be an odd prime and consider $\mathbb{Q}(\sqrt{p})$. If $q \ne p$ is prime show that
$\sigma_q(\sqrt{p}) = (p/q)\sqrt{p}$ where $\sigma_q$ is the Frobenius automorphism at a prime ideal in $\mathbb{Q}(\sqrt{p})$ lying
above q.

31. Let F be an algebraic number field and $\mathfrak{A}$ an ideal in the ring of integers of F. Show
that there is a finite extension L of F with ring of integers S such that $\mathfrak{A}S$ is principal.

32. Let P be a prime ideal in the ring of integers D of a number field F. If $a \equiv b \ (P^t)$ and
$\operatorname{ord}_P b < t$ show that $\operatorname{ord}_P a = \operatorname{ord}_P b$.

33. Let $K \subset L$ be number fields with rings of integers R and S respectively. If A and B
are ideals in R such that AS divides BS then show that A divides B.

34. The notation being as in Exercise 33 show that $AS \cap R = A$.



Chapter 13 

Quadratic and Cyclotomic Fields 


In the last chapter we discussed the general theory of 
algebraic number fields and their rings of integers. We 
now consider in greater detail two important classes of 
these fields which were studied first in the nineteenth 
century by Gauss, Eisenstein, Kummer, Dirichlet, and 
others in connection with the theory of quadratic forms, 
higher reciprocity laws and Fermat’s Last Theorem. The 
reader who is interested in the historical development of 
this subject should consult the book by H. Edwards [128] 
as well as the classical treatise by H. Smith [72]. 

We will develop in this chapter only those results that 
will be needed for the applications in later chapters. The 
fundamental result describes the manner in which rational 
primes decompose into a product of prime ideals. However, 
we could not resist giving yet another proof of the law of 
quadratic reciprocity based on the decomposition laws of 
these fields. 


§1 Quadratic Number Fields 


An algebraic number field F will be called a quadratic number field if
[F : Q] = 2. Let $D \subset F$ be, as usual, the ring of integers in F. Our first goal
will be to find an explicit integral basis for D.

Let $F = \mathbb{Q}(\alpha)$. The element $\alpha$ must satisfy a quadratic equation $ax^2 +
bx + c = 0$ with $a, b, c \in \mathbb{Z}$. Thus

$$\alpha = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

Let $A = b^2 - 4ac$. Then, clearly, $F = \mathbb{Q}(\sqrt{A})$. Let $A = A_1^2A_2$ where
$A_1, A_2 \in \mathbb{Z}$ and $A_2$ is square-free. Then $F = \mathbb{Q}(\sqrt{A_2})$. Changing notation, we
have shown that every quadratic number field has the form $\mathbb{Q}(\sqrt{d})$ where d
is a square-free integer.
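
The reduction from $A = b^2 - 4ac$ to a square-free d is easy to carry out by trial division. This is a minimal sketch, not an efficient algorithm; the function names `squarefree_decomposition` and `quadratic_field_generator` are hypothetical.

```python
def squarefree_decomposition(A):
    """Write a nonzero integer A as A1**2 * A2 with A2 square-free; return (A1, A2)."""
    if A == 0:
        raise ValueError("A must be nonzero")
    a1, a2 = 1, A
    q = 2
    while q * q <= abs(a2):
        while a2 % (q * q) == 0:     # strip every factor q^2 from a2
            a2 //= q * q
            a1 *= q
        q += 1
    return a1, a2

def quadratic_field_generator(a, b, c):
    """Square-free d with Q(alpha) = Q(sqrt(d)) for a root alpha of a*x^2 + b*x + c."""
    return squarefree_decomposition(b * b - 4 * a * c)[1]
```

If the function returns d = 1 the polynomial has rational roots and $\mathbb{Q}(\alpha) = \mathbb{Q}$ is not a quadratic field.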

If $\sigma$ is any isomorphism of F/Q into C we apply $\sigma$ to $(\sqrt{d})^2 = d$ and
find $(\sigma\sqrt{d})^2 = d$. Thus $\sigma\sqrt{d} = \pm\sqrt{d}$. It follows that F/Q is a Galois
extension. The Galois group has two elements, the identity and an
automorphism taking $\sqrt{d}$ to $-\sqrt{d}$.

Every element of F has the form $\alpha = r + s\sqrt{d}$ with $r, s \in \mathbb{Q}$. The nontrivial
automorphism takes $\alpha$ to $\alpha' = r - s\sqrt{d}$. Thus, $t(\alpha) = \alpha + \alpha' = 2r$ and
$N(\alpha) = \alpha\alpha' = r^2 - ds^2$.

If $\gamma \in D$ then $t(\gamma)$ and $N(\gamma) \in \mathbb{Z}$. Conversely, if these conditions hold then $\gamma$
satisfies $0 = (x - \gamma)(x - \gamma') = x^2 - t(\gamma)x + N(\gamma) \in \mathbb{Z}[x]$ showing that $\gamma \in D$.
Thus $\gamma \in D$ iff $t(\gamma)$ and $N(\gamma) \in \mathbb{Z}$.

Proposition 13.1.1. If $d \equiv 2, 3 \ (4)$ then $D = \mathbb{Z} + \mathbb{Z}\sqrt{d}$.

If $d \equiv 1 \ (4)$ then $D = \mathbb{Z} + \mathbb{Z}((-1 + \sqrt{d})/2)$.

Proof. Suppose $\gamma = r + s\sqrt{d}$, $r, s \in \mathbb{Q}$. Then $\gamma \in D$ iff $2r$ and $r^2 - s^2d \in \mathbb{Z}$.
Since $2r \in \mathbb{Z}$ it follows from the second condition that $4s^2d \in \mathbb{Z}$. Since d is
square-free it follows that $2s \in \mathbb{Z}$. Set 2r = m and 2s = n. Then, $r^2 - ds^2 \in \mathbb{Z}$
implies $m^2 - dn^2 \equiv 0 \ (4)$.

Recall that a square is congruent to either 0 or 1 modulo 4.

If $d \equiv 2, 3 \ (4)$ then $m^2 - dn^2 \equiv m^2 + 2n^2$ or $m^2 + n^2 \ (4)$. The only way
that $m^2 + 2n^2$ or $m^2 + n^2$ can be divisible by 4 is for both m and n to be even.
This is the case iff r and s are in Z. This establishes the first assertion.

If $d \equiv 1 \ (4)$ then $m^2 - dn^2$ is congruent to $m^2 - n^2$ modulo 4. But
$m^2 - n^2 \equiv 0 \ (4)$ iff m and n have the same parity, i.e., they are either both odd
or both even. Thus $D = \{(m + n\sqrt{d})/2 \mid m \equiv n \ (2)\}$. Notice

$$\frac{m + n\sqrt{d}}{2} = \frac{m + n}{2} + n\left(\frac{-1 + \sqrt{d}}{2}\right).$$

Since $m \equiv n \ (2)$, $(m + n)/2 \in \mathbb{Z}$. Thus $D \subset \mathbb{Z} + \mathbb{Z}(-1 + \sqrt{d})/2$. To
establish the reverse inclusion we simply notice that $(-1 + \sqrt{d})/2 \in D$ since
$d \equiv 1 \ (4)$. □
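
The criterion "$\gamma \in D$ iff $t(\gamma)$ and $N(\gamma) \in \mathbb{Z}$" translates directly into a test on the rational coordinates r, s. The sketch below is illustrative only (the function name `is_integral` is hypothetical); it checks that $(-1 + \sqrt{d})/2$ is an algebraic integer exactly for the sampled square-free d with $d \equiv 1 \ (4)$.

```python
from fractions import Fraction

def is_integral(r, s, d):
    """True if r + s*sqrt(d) has integral trace 2r and integral norm r^2 - d*s^2."""
    r, s = Fraction(r), Fraction(s)
    trace, norm = 2 * r, r * r - d * s * s
    return trace.denominator == 1 and norm.denominator == 1

half = Fraction(1, 2)
# (-1 + sqrt(d))/2 should be integral exactly when d = 1 (mod 4)
results = {d: is_integral(-half, half, d) for d in (-7, -3, -1, 2, 3, 5, 6, 13)}
```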


We can now calculate the discriminant of quadratic number fields. 


Proposition 13.1.2. Let $\delta_F$ denote the discriminant of F.

If $d \equiv 2, 3 \ (4)$ then $\delta_F = 4d$.

If $d \equiv 1 \ (4)$ then $\delta_F = d$.

Proof. If $d \equiv 2, 3 \ (4)$ set $\omega_1 = 1$ and $\omega_2 = \sqrt{d}$. Then

$$(t(\omega_i\omega_j)) = \begin{pmatrix} 2 & 0 \\ 0 & 2d \end{pmatrix}.$$

Thus $\delta_F = \det(t(\omega_i\omega_j)) = 4d$.

If $d \equiv 1 \ (4)$ set $\omega_1 = 1$ and $\omega_2 = (-1 + \sqrt{d})/2$. Then

$$(t(\omega_i\omega_j)) = \begin{pmatrix} 2 & -1 \\ -1 & (d + 1)/2 \end{pmatrix}.$$

Thus $\delta_F = \det(t(\omega_i\omega_j)) = d$. □

Having investigated D and $\delta_F$ we now want to determine how rational
primes $p \in \mathbb{Z}$ split in D. From Theorem 3' of Chapter 12 we know efg = 2,
so we have three cases: e = 2, f = 1, g = 1 or e = 1, f = 1, g = 2, or e = 1,
f = 2, g = 1. We say, respectively, that p ramifies, splits (decomposes), or is
inertial (remains prime).

If p is a prime in Z let P be a prime ideal in D containing p. Let $P' =
\{\gamma' \mid \gamma \in P\}$.

Proposition 13.1.3. Suppose p is odd.

(i) If $p \nmid \delta_F$ and $x^2 \equiv d \ (p)$ is solvable in Z then $(p) = PP'$, $P \ne P'$.

(ii) If $p \nmid \delta_F$ and $x^2 \equiv d \ (p)$ is not solvable in Z then (p) = P.

(iii) If $p \mid \delta_F$ then $(p) = P^2$.

Proof. In case (i) suppose $a^2 \equiv d \ (p)$ with $a \in \mathbb{Z}$. We claim that $(p) =
(p, a + \sqrt{d})(p, a - \sqrt{d})$. In fact, $(p, a + \sqrt{d})(p, a - \sqrt{d}) = (p)(p, a + \sqrt{d},
a - \sqrt{d}, (a^2 - d)/p)$. The latter ideal is D since it contains p and 2a and these
two numbers are relatively prime. We claim $(p, a + \sqrt{d}) \ne (p, a - \sqrt{d})$.
If equality held then the ideal would contain p and 2a and so would equal D
and it would follow that (p) = D. Thus p splits as asserted.

In case (ii) we claim P has degree 2. If the degree of P is 1 then D/P has p elements.
Since Z/pZ injects into D/P it would follow that every coset of D/P is
represented by a rational integer. Let $a \in \mathbb{Z}$ be such that $a \equiv \sqrt{d} \ (P)$. Then $a^2 \equiv
d \ (P)$ and $a^2 \equiv d \ (p)$ contrary to assumption. Thus p remains prime as
asserted.

Finally, in case (iii) we claim $(p) = (p, \sqrt{d})^2$. In fact, $(p, \sqrt{d})^2 = (p)
(p, \sqrt{d}, d/p)$. The latter ideal is D since p and d/p are relatively prime
(remember that d is square-free). Thus p ramifies as asserted. □

We now discuss the decomposition of the prime $p = 2$. Remember that by
Proposition 13.1.2 we have $2 \nmid \delta_F$ if and only if $d \equiv 1 \ (4)$.

Proposition 13.1.4. Suppose $p = 2$.

(i) If $2 \nmid \delta_F$ and $d \equiv 1 \ (8)$ then $(2) = PP'$ and $P \ne P'$.

(ii) If $2 \nmid \delta_F$ and $d \equiv 5 \ (8)$ then $(2) = P$.

(iii) If $2 \mid \delta_F$ then $(2) = P^2$.

Proof. If $d \equiv 1 \ (8)$ we claim that $(2) = (2, (1 + \sqrt{d})/2)(2, (1 - \sqrt{d})/2)$. In
fact $(2, (1 + \sqrt{d})/2)(2, (1 - \sqrt{d})/2) = (2)(2, (1 + \sqrt{d})/2, (1 - \sqrt{d})/2,
(1 - d)/8)$. The latter ideal is $D$ since it contains $1 = (1 + \sqrt{d})/2 +
(1 - \sqrt{d})/2$. Moreover, $(2, (1 + \sqrt{d})/2) \ne (2, (1 - \sqrt{d})/2)$ since otherwise
the ideal contains 1 and it would follow that $(2) = D$.

If $d \equiv 5 \ (8)$ we claim $P$ has degree 2. If not (as in part (ii) of the last proposition)
there is an integer $a \in \mathbb{Z}$ such that $a \equiv (1 + \sqrt{d})/2 \ (P)$. Since
$(1 + \sqrt{d})/2$ satisfies $x^2 - x + (1 - d)/4 = 0$ we would have $a^2 - a +
(1 - d)/4 \equiv 0 \ (P)$ and so $a^2 - a + (1 - d)/4 \equiv 0 \ (2)$. For all $a \in \mathbb{Z}$, $a^2 - a$ is
even. It follows that $(1 - d)/4 \equiv 0 \ (2)$ or $d \equiv 1 \ (8)$ contrary to assumption.

Now suppose $2 \mid \delta_F$. We must have $d \equiv 2, 3 \ (4)$. If $d \equiv 2 \ (4)$ then $(2) =
(2, \sqrt{d})^2$ and if $d \equiv 3 \ (4)$ then $(2) = (2, 1 + \sqrt{d})^2$. We leave the simple
verification to the reader. □

We note that we can state the decomposition law for odd primes in a
succinct manner using the Legendre symbol. Namely, if $(\delta_F/p) = 1$ then $p$
splits, if $(\delta_F/p) = -1$ then $p$ remains prime, and if $(\delta_F/p) = 0$ then $p$ ramifies.
Furthermore the decomposition of $p$, $p$ odd, depends only on the residue
class of $p$ modulo $\delta_F$. For if $d \equiv 2$ or $3$ modulo 4 then $\delta_F = 4d$ and the result
follows from Proposition 5.3.3 and Exercise 37 of Chapter 5. If $d \equiv 1 \ (4)$ then
we may argue as follows. Since $d \equiv 1 \ (4)$ we have $\delta_F = d$. Thus, by the
reciprocity law for the Jacobi symbol,

$$(\delta_F/p) = (p/\delta_F),$$

where the symbol on the right is the Jacobi symbol taken with denominator $|\delta_F|$.
The value of $(p/\delta_F)$ depends only on the residue class of $p$ modulo $\delta_F$.
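
The decomposition law of Propositions 13.1.3 and 13.1.4 is easy to try out numerically. The following Python sketch is only an illustration: the function names `legendre`, `field_discriminant`, and `splitting_type` are our own, and $d$ is assumed to be a square-free integer other than 0 and 1.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1, or p - 1


def field_discriminant(d: int) -> int:
    """Discriminant of Q(sqrt(d)) for square-free d (Proposition 13.1.2)."""
    return d if d % 4 == 1 else 4 * d


def splitting_type(d: int, p: int) -> str:
    """Return 'ramifies', 'splits', or 'inert' for the prime p in Q(sqrt(d))."""
    disc = field_discriminant(d)
    if disc % p == 0:
        return "ramifies"
    if p == 2:
        # Here d = 1 (4); Proposition 13.1.4 distinguishes d modulo 8.
        return "splits" if d % 8 == 1 else "inert"
    return "splits" if legendre(disc, p) == 1 else "inert"
```

For instance, `splitting_type(-1, 5)` reports that 5 splits in $\mathbb{Q}(i)$, in agreement with $5 = (2 + i)(2 - i)$.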

Next we determine the structure of the group of units in $D$. It is simple to
see that $\alpha$ is a unit iff $N(\alpha) = \pm 1$. Consider first the case of an imaginary
quadratic field, so that $d < 0$. Let $U_d$ denote the group of units in $D$.

Proposition 13.1.5. If $d < 0$ and square-free then

(a) $U_{-1} = \{1, i, -1, -i\}$.

(b) $U_{-3} = \{\pm 1, \pm\omega, \pm\omega^2\}$, where $\omega = (-1 + \sqrt{-3})/2$.

(c) $U_d = \{1, -1\}$ for $d < -3$, or $d = -2$.

Proof. If $d \equiv 2$ or $3 \ (4)$ then any unit may be written in the form $\alpha = x + \sqrt{d}\,y$,
$x, y \in \mathbb{Z}$. Thus $N(\alpha) = \pm 1$ is equivalent to $x^2 + |d|y^2 = 1$. If $d = -1$ we
obtain (a). If $|d| > 1$ then clearly $U_d = \{+1, -1\}$.

If $d \equiv 1 \ (4)$ write $\alpha = (x + \sqrt{d}\,y)/2$ where $x \equiv y \ (2)$. Then $N(\alpha) = \pm 1$ is
equivalent to $x^2 + |d|y^2 = 4$. If $d = -3$ the solutions to $x^2 + 3y^2 = 4$ give
part (b) while if $|d| > 3$ the equation $x^2 + |d|y^2 = 4$ clearly gives $U_d =
\{+1, -1\}$. This completes the proof. □

Thus the determination of the unit group is quite simple in the imaginary 
case. The case of a real quadratic field is considerably more difficult. 

If $d > 0$ and square-free the equation $x^2 - dy^2 = 1$ is called Pell's
equation. In Chapter 17, Section 5 it is shown that this equation has a solution
in nonzero integers $x, y$. The proof is elementary. Assuming this result we
describe the units in $D$ in the real quadratic case.

Proposition 13.1.6. If $D$ is the ring of integers in $\mathbb{Q}(\sqrt{d})$, $d > 0$, then there exists
a unit $u > 1$ such that every unit is of the form $\pm u^m$, $m \in \mathbb{Z}$.

Proof. By Proposition 17.5.2 there exist positive integers $x, y$ such
that $x^2 - dy^2 = 1$. Thus $x + \sqrt{d}\,y = u$ is a unit in $D$, $u > 1$. Let $M$ be a
fixed real number, $M > u$. By Exercise 4, Chapter 12 there are at most a
finite number of $\alpha \in D$ with $|\alpha| < M$, $|\alpha'| < M$ where $\alpha'$ is the conjugate of $\alpha$.
If $\beta$ is a unit, $1 < \beta < M$, then $N(\beta) = \beta\beta' = \pm 1$. If $\beta' = -1/\beta$ then $-M <
-1/\beta < M$ and if $\beta' = 1/\beta$ then also $-M < 1/\beta < M$. Thus there are only
finitely many units $\beta$ with $1 < \beta < M$ and there is at least one, viz., $u$. Let $\varepsilon$
be the smallest positive unit $\varepsilon > 1$. If $\tau$ is any positive unit then there is a
unique integer $s$ (not necessarily positive) with $\varepsilon^s \le \tau < \varepsilon^{s+1}$. Then $1 \le
\tau\varepsilon^{-s} < \varepsilon$ and since $\tau\varepsilon^{-s}$ is a unit we have $\tau\varepsilon^{-s} = 1$. If $\tau$ is negative then $-\tau$ is
positive and $-\tau = \varepsilon^s$. This completes the proof. □

The unique unit $\varepsilon$ defined in Proposition 13.1.6 is called the fundamental
unit of $\mathbb{Q}(\sqrt{d})$. The set of $d > 0$ for which the norm of $\varepsilon$ is $-1$ has not been
determined. However there are many interesting results in that direction (see
[196], pp. 124-126). It has been conjectured that for $d = p$, $p \equiv 1 \ (4)$ and
prime, and $\varepsilon = (u + v\sqrt{p})/2$ that $p \nmid v$ [86]. The fundamental unit, even for
small discriminants, can be difficult to compute. For example, the fundamental
unit of $\mathbb{Q}(\sqrt{94})$ is $2143295 + 221064\sqrt{94}$.
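
The size of this unit can be checked by machine. The Python sketch below is a brute-force search and not the method of Chapter 17; the function name `fundamental_unit` is our own. It assumes $d > 1$ is square-free and uses the fact that every unit greater than 1 has the form $(x + y\sqrt{d})/2$ with positive integers $x, y$ satisfying $x^2 - dy^2 = \pm 4$, the fundamental unit being the one with the smallest $y$ (and, for equal $y$, the smallest $x$).

```python
from math import isqrt


def fundamental_unit(d: int) -> tuple:
    """Return (x, y) such that (x + y*sqrt(d))/2 is the smallest unit > 1.

    Brute-force search over y = 1, 2, ... for x^2 - d*y^2 = -4 or +4.
    d is assumed to be a square-free integer greater than 1.
    """
    y = 1
    while True:
        for rhs in (-4, 4):  # try -4 first: it yields the smaller x
            x_squared = d * y * y + rhs
            if x_squared > 0:
                x = isqrt(x_squared)
                if x * x == x_squared:
                    return x, y
        y += 1
```

For $d = 94$ the search returns `(4286590, 442128)`, that is, $(4286590 + 442128\sqrt{94})/2 = 2143295 + 221064\sqrt{94}$, after roughly 440,000 steps.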

These results on units are special cases of the important Dirichlet unit 
theorem which gives the structure of the group of units in an arbitrary 
number field. This theorem states that the group of units modulo the subgroup
of roots of unity in the field is a finitely generated group with $r + s - 1$
generators, where s is the number of pairs of complex conjugate roots and r is 
the number of real roots of a generator for the field. In the case of quadratic 
fields this number is clearly 0 or 1 according as the field is imaginary or real, 
which agrees with the above results. 

As regards the class number there is an exceedingly rich theory for quadratic
number fields. In fact there exist explicit formulas, discovered by
Dirichlet. We give a particularly elegant special case. Suppose $q > 3$ is a
prime and $q \equiv 3 \ (4)$. Let $F = \mathbb{Q}(\sqrt{-q})$. Let $V$ and $R$ represent the sum of the
quadratic nonresidues and quadratic residues modulo $q$, respectively, among
the numbers $1, 2, 3, \ldots, q - 1$. Then $h_F = (1/q)(V - R)$.

For example, let $q = 7$. Then $V = 3 + 5 + 6 = 14$ and $R = 1 + 2 + 4
= 7$. Thus $h_F = \frac{1}{7}(14 - 7) = 1$.
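
The formula $h_F = (1/q)(V - R)$ lends itself to direct computation. The following Python sketch simply evaluates it; the function name is our own, and the input is assumed to be a prime $q > 3$ with $q \equiv 3 \ (4)$ (this is not checked beyond the final divisibility test).

```python
def class_number_minus_q(q: int) -> int:
    """Class number of Q(sqrt(-q)) via h = (V - R)/q, for a prime q > 3, q = 3 (4)."""
    residues = {(x * x) % q for x in range(1, q)}
    R = sum(residues)            # sum of the quadratic residues in 1..q-1
    V = q * (q - 1) // 2 - R     # the remaining numbers are the nonresidues
    h, remainder = divmod(V - R, q)
    if remainder != 0:
        raise ValueError("formula does not apply to this q")
    return h
```

With $q = 23$ one finds $R = 92$, $V = 161$, and hence $h_F = 3$.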

If we restrict our attention to $d < 0$ then C. L. Siegel proved that
$\ln h_F / \ln |\delta_F|^{1/2} \to 1$ as $|\delta_F| \to \infty$. It follows that there are at most finitely many
$d < 0$ for which $\mathbb{Q}(\sqrt{d})$ has class number below a fixed bound.

Gauss conjectured that the only $d$ for which the class number of $\mathbb{Q}(\sqrt{d})$
is 1 are $d = -1, -2, -3, -7, -11, -19, -43, -67$, and $-163$. The first
generally accepted proof was provided by H. Stark. In essence a proof had
been given earlier by K. Heegner, but because of obscurities in the exposition
his proof was at first not thought to be valid.

For positive $d$, Gauss conjectured that infinitely many of the fields $\mathbb{Q}(\sqrt{d})$
have class number 1. This, however, remains an open problem. 

A beautiful formula that determines the class number of a real quadratic
field of discriminant $p$, $p$ a prime congruent to 1 modulo 4, is

$$\varepsilon^h = \prod_j (\sin(\pi j/p))^{-\chi(j)},$$

where $\varepsilon$ is the fundamental unit, $\chi$ is the Legendre symbol,
and the product is over the numbers $j = 1, \ldots, (p - 1)/2$. A similar formula
holds for arbitrary discriminant. For these results and their proofs see
Borevich and Shafarevich [9], Chapter 5.
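
A floating-point check of this product is straightforward. The Python sketch below (the names `legendre` and `sine_product` are our own) evaluates the product of $\sin(\pi j/p)^{-\chi(j)}$ over $j = 1, \ldots, (p - 1)/2$; for $p = 5, 13, 17$, where the class number is 1, the value should agree with the fundamental units $(1 + \sqrt{5})/2$, $(3 + \sqrt{13})/2$, and $4 + \sqrt{17}$ up to rounding error.

```python
from math import pi, sin, sqrt


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def sine_product(p: int) -> float:
    """Product of sin(pi*j/p) ** (-chi(j)) over j = 1, ..., (p - 1)/2."""
    value = 1.0
    for j in range(1, (p - 1) // 2 + 1):
        value *= sin(pi * j / p) ** (-legendre(j, p))
    return value
```

Being a numerical experiment, this illustrates the formula but of course proves nothing.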

We conclude this section by mentioning several other results whose proofs
are beyond the scope of an elementary treatment. Consider an imaginary
quadratic field of discriminant $d$. Then the class number of this field is
divisible by $2^{t-1}$ where $t$ is the number of distinct prime divisors of $d$. Thus the
class number of $\mathbb{Q}(\sqrt{-210})$ is divisible by 8. It turns out that the class
number is exactly 8. A similar result holds for real quadratic number fields.

The following most remarkable fact has been discovered by F. Hirzebruch.
Let $p$ be a prime congruent to 3 modulo 4 and assume that the class number of
$\mathbb{Q}(\sqrt{p})$ is one. Then the class number of the imaginary quadratic field
$\mathbb{Q}(\sqrt{-p})$ is one third of the alternating sum $a_s - a_{s-1} + a_{s-2} - \cdots \pm a_1$,
where the continued fraction of $\sqrt{p}$ is, in the standard notation,
$\langle a_0, a_1, a_2, \ldots, a_s \rangle$ with period $a_1, \ldots, a_s$ (see Stark [73], Chapter 7). For example, both $\mathbb{Q}(\sqrt{67})$
and $\mathbb{Q}(\sqrt{-67})$ have class number one and

$$\sqrt{67} = \langle 8, 5, 2, 1, 1, 7, 1, 1, 2, 5, 16 \rangle.$$
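
Hirzebruch's alternating sum can be computed directly from the continued fraction of $\sqrt{p}$. The Python sketch below uses the standard recurrence for the continued fraction of a quadratic surd and the fact that the period ends with the term $2a_0$; the function names are our own, and the hypothesis that $\mathbb{Q}(\sqrt{p})$ has class number one is not checked by the code.

```python
from math import isqrt


def sqrt_continued_fraction(n: int) -> list:
    """Return [a0, a1, ..., as]: a0 followed by one full period for sqrt(n).

    n is assumed to be a positive integer that is not a perfect square.
    """
    a0 = isqrt(n)
    terms = [a0]
    m, d, a = 0, 1, a0
    while a != 2 * a0:  # the period of sqrt(n) ends with the term 2*a0
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        terms.append(a)
    return terms


def hirzebruch_sum(p: int) -> int:
    """Alternating sum a_s - a_(s-1) + a_(s-2) - ... for sqrt(p)."""
    period = sqrt_continued_fraction(p)[1:]
    return sum((-1) ** k * a for k, a in enumerate(reversed(period)))
```

For $p = 67$ the sum is 3, giving class number $3/3 = 1$ for $\mathbb{Q}(\sqrt{-67})$; for $p = 23$ the sum is 9, giving class number 3.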

§2 Cyclotomic Fields 

Let $m$ be a positive integer and $\zeta_m = e^{2\pi i/m}$. The number $\zeta_m$ satisfies $x^m - 1
= 0$ as do all the powers of $\zeta_m$. Thus, we have $x^m - 1 = (x - 1)(x - \zeta_m) \cdots
(x - \zeta_m^{m-1})$. It follows that the field $F = \mathbb{Q}(\zeta_m)$ is the splitting field of the
polynomial $x^m - 1$. Thus $F/\mathbb{Q}$ is a Galois extension.

We call $F = \mathbb{Q}(\zeta_m)$ the cyclotomic field of $m$th roots of unity. It was first
studied by Gauss in connection with his investigations into the constructibility
of regular polygons (see Chapter 9, Section 11).

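Proposition 13.2.1. Let $G$ be the Galois group of $F/\mathbb{Q}$. There is a monomorphism
$\theta: G \to U(\mathbb{Z}/m\mathbb{Z})$ such that for $\sigma \in G$

$$\sigma\zeta_m = \zeta_m^{\theta(\sigma)}.$$

Proof. Since $\zeta_m^m = 1$ we have $(\sigma\zeta_m)^m = 1$. Thus $\sigma\zeta_m = \zeta_m^{\theta(\sigma)}$ where $\theta(\sigma)$ is an
integer modulo $m$. If $\tau = \sigma^{-1}$ then $\zeta_m = \tau\sigma\zeta_m = \tau(\zeta_m^{\theta(\sigma)}) = \zeta_m^{\theta(\tau)\theta(\sigma)}$. Thus
$\theta(\tau)\theta(\sigma) = \bar{1}$ (where $\bar{1}$ is the coset of 1 in $\mathbb{Z}/m\mathbb{Z}$). Thus $\theta: G \to U(\mathbb{Z}/m\mathbb{Z})$. It is
easily checked that $\theta$ is a homomorphism. Finally, if $\theta(\sigma) = \bar{1}$ then $\sigma\zeta_m = \zeta_m$,
implying $\sigma$ is the identity of $G$ since $\zeta_m$ generates $F$ over $\mathbb{Q}$. □

Corollary. $[\mathbb{Q}(\zeta_m) : \mathbb{Q}]$ divides $\phi(m)$.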

We will show later that in fact $[\mathbb{Q}(\zeta_m) : \mathbb{Q}] = \phi(m)$.

Definition. Let $\Phi_m(x) = \prod_{(a,m)=1} (x - \zeta_m^a)$ where $1 \le a < m$. This polynomial
is called the $m$th cyclotomic polynomial.

The roots of $\Phi_m(x)$ are precisely the primitive $m$th roots of unity, i.e., those
$m$th roots of unity of order $m$. Clearly the degree of $\Phi_m(x)$ is $\phi(m)$.

Proposition 13.2.2. $x^m - 1 = \prod_{d \mid m} \Phi_d(x)$.

Proof.

$$x^m - 1 = \prod_{i=0}^{m-1} (x - \zeta_m^i) = \prod_{d \mid m} \prod_{(i,m)=d} (x - \zeta_m^i).$$

We claim $\prod_{(i,m)=d} (x - \zeta_m^i) = \Phi_{m/d}(x)$. The proposition will follow from
this.

If $(i, m) = d$, let $i = dj$. Then $\zeta_m^i = \zeta_m^{dj} = \zeta_{m/d}^j$. Moreover, $(j, m/d) = 1$. Thus

$$\prod_{(i,m)=d} (x - \zeta_m^i) = \prod_{(j,m/d)=1} (x - \zeta_{m/d}^j) = \Phi_{m/d}(x).$$ □

Corollary. $\Phi_m(x) \in \mathbb{Z}[x]$.

Proof. We proceed by induction on $m$. $\Phi_1(x) = x - 1$. Now suppose the
corollary has been established for integers less than $m$. By the proposition,
$\Phi_m(x) = (x^m - 1)/f(x)$, where $f(x)$ is a monic polynomial which by the
induction hypothesis is in $\mathbb{Z}[x]$. It follows by "long division" that $\Phi_m(x) \in
\mathbb{Z}[x]$. □
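
The induction in this proof is effectively an algorithm: divide $x^m - 1$ by the product of the $\Phi_d(x)$ over the proper divisors $d$ of $m$. The Python sketch below carries this out with integer coefficient lists (highest degree first). The function names are our own, and the recursion is written for clarity rather than speed.

```python
def poly_multiply(a: list, b: list) -> list:
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_divide_exact(num: list, den: list) -> list:
    """Long division of num by a monic den; the remainder must vanish."""
    num = num[:]
    quotient = []
    for i in range(len(num) - len(den) + 1):
        c = num[i]  # den is monic, so the next quotient coefficient is num[i]
        quotient.append(c)
        for k, dk in enumerate(den):
            num[i + k] -= c * dk
    if any(num):
        raise ValueError("division was not exact")
    return quotient


def cyclotomic(m: int) -> list:
    """Coefficients of Phi_m, using x^m - 1 = product of Phi_d over d | m."""
    f = [1]  # product of Phi_d over the proper divisors d of m
    for d in range(1, m):
        if m % d == 0:
            f = poly_multiply(f, cyclotomic(d))
    x_m_minus_1 = [1] + [0] * (m - 1) + [-1]
    return poly_divide_exact(x_m_minus_1, f)
```

Because the divisor is monic, every step of the long division stays in $\mathbb{Z}$, which is exactly the point of the corollary. For example, `cyclotomic(12)` returns `[1, 0, -1, 0, 1]`, i.e., $\Phi_{12}(x) = x^4 - x^2 + 1$.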

An alternate proof of the corollary goes as follows. Every $\sigma \in G$ permutes
the primitive $m$th roots of unity. Thus the coefficients of $\Phi_m(x)$ are left fixed by
$G$ and so are in $\mathbb{Q}$. Since they are clearly algebraic integers they must be in $\mathbb{Z}$.
From now on we write $\zeta_m = \zeta$, $F = \mathbb{Q}(\zeta)$, and $D$ for the ring of integers in $F$.

Proposition 13.2.3. Suppose $p$ is a rational prime and $p \nmid m$. Let $P$ be a prime
ideal in $D$ containing $p$. Then the cosets of $1, \zeta, \zeta^2, \ldots, \zeta^{m-1}$ in $D/P$ are all
distinct. If $f$ denotes the degree of $P$ then $p^f \equiv 1 \ (m)$.

Proof. For $w \in D$ let $\bar{w}$ denote its coset in $D/P$.

Divide both sides of $x^m - 1 = \prod (x - \zeta^i)$ by $x - 1$. We find

$$1 + x + \cdots + x^{m-1} = \prod_{i=1}^{m-1} (x - \zeta^i).$$

Let $x = 1$ in this identity. We find $m = \prod (1 - \zeta^i)$ where $1 \le i \le m - 1$.
Thus $\bar{m} = \prod (\bar{1} - \bar{\zeta}^i)$. Since $\bar{m} \ne \bar{0}$ it follows that $\bar{\zeta}^i \ne \bar{1}$ for $1 \le i \le m - 1$,
and so $\bar{\zeta}^i \ne \bar{\zeta}^j$ for $i \ne j$, $0 \le i, j \le m - 1$.

The elements $\{\bar{\zeta}^i \mid 0 \le i \le m - 1\}$ form a subgroup of order $m$ in the
multiplicative group of $D/P$. The latter group has order $p^f - 1$. Therefore
$p^f \equiv 1 \ (m)$. □

Theorem 1. The $m$th cyclotomic polynomial, $\Phi_m(x)$, is irreducible in $\mathbb{Z}[x]$.

Proof. Let $f(x) \in \mathbb{Z}[x]$ be the monic irreducible polynomial for $\zeta$. The fact
that $f(x)$ has coefficients in $\mathbb{Z}$ follows from the fact that $\zeta$ is an algebraic
integer (Exercise 16, Chapter 6). If $p \nmid m$ is a prime we will show that $\zeta^p$ is also a
root of $f(x)$. If $a \in \mathbb{Z}$, and $(a, m) = 1$, then by factoring $a$ into a product of
primes it will follow that $\zeta^a$ is a root of $f(x)$. Thus $\deg f(x) \ge \phi(m)$. On the
other hand, since $\Phi_m(\zeta) = 0$, $f(x)$ divides $\Phi_m(x)$ which has degree $\phi(m)$. It
will then follow that $f(x) = \Phi_m(x)$.

Now, let $p$ be a prime, $p \nmid m$, and let $P$ be a prime ideal of $D$ containing $p$. As
usual, if $w \in D$ then $\bar{w}$ will denote the residue class of $w$ in $D/P$. We have $x^m - 1
= f(x)g(x)$ and so $x^m - \bar{1} = \bar{f}(x)\bar{g}(x)$ in $\mathbb{Z}/p\mathbb{Z}[x]$. By the last proposition
$x^m - \bar{1}$ has distinct roots in $D/P$. It follows that $\bar{f}(x)$ and $\bar{g}(x)$ have no common
root. Suppose $f(\zeta^p) \ne 0$. Then $g(\zeta^p) = 0$ and $\bar{g}(\bar{\zeta}^p) = \bar{0}$. The coefficients
of $\bar{g}(x)$ are in $\mathbb{Z}/p\mathbb{Z}$ and are thus equal to their own $p$th power. From this we
see $\bar{0} = \bar{g}(\bar{\zeta}^p) = \bar{g}(\bar{\zeta})^p$ and so $\bar{0} = \bar{g}(\bar{\zeta})$. It follows that $\bar{f}(\bar{\zeta}) \ne \bar{0}$ which is not
true because $f(\zeta) = 0$. One concludes $f(\zeta^p) = 0$ as asserted. □

Corollary 1. $[\mathbb{Q}(\zeta_m) : \mathbb{Q}] = \phi(m)$.

Corollary 2. The map $\theta$ of Proposition 13.2.1 is an isomorphism of $G$ onto
$U(\mathbb{Z}/m\mathbb{Z})$.

Proof. Both $G$ and $U(\mathbb{Z}/m\mathbb{Z})$ have $\phi(m)$ elements. Since $\theta$ is one-to-one it must
be onto. □

By Corollary 2 we see that for every $a \in \mathbb{Z}$ with $(a, m) = 1$ there is a
$\sigma_a \in G$ such that $\sigma_a\zeta = \zeta^a$. The map $a \to \sigma_a$ gives rise to a homomorphism
from $U(\mathbb{Z}/m\mathbb{Z})$ to $G$ which is inverse to $\theta$.

If $p$ is a prime, $p \nmid m$, we wish to study more closely the automorphism $\sigma_p$.
Before we do so, some preliminary work is needed.

Lemma 1. Let $F/\mathbb{Q}$ be an algebraic number field of degree $n$. Let $D \subset F$ be the
ring of integers and $\alpha_1, \alpha_2, \ldots, \alpha_n \in D$ a field basis for $F/\mathbb{Q}$. Let $\Delta = \Delta(\alpha_1,
\alpha_2, \ldots, \alpha_n)$ be the discriminant of this basis. Then $\Delta D \subset \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2 + \cdots
+ \mathbb{Z}\alpha_n$.

Proof. Let $w \in D$. We have $w = \sum r_i\alpha_i$ with $r_i \in \mathbb{Q}$. Multiply both sides by $\alpha_j$
and take the trace. We find $t(w\alpha_j) = \sum_i r_i\,t(\alpha_i\alpha_j)$. The elements $t(w\alpha_j)$ and
$t(\alpha_i\alpha_j)$ are all in $\mathbb{Z}$ since they are traces of algebraic integers. Using Cramer's
rule to solve for the $r_i$ we see that each $r_i$ is an integer divided by $\Delta$. The
result follows. □

Lemma 2. The discriminant $\Delta = \Delta(1, \zeta, \ldots, \zeta^{\phi(m)-1})$ divides $m^{\phi(m)}$.

Proof. Differentiate both sides of $x^m - 1 = \Phi_m(x)g(x)$. We find $mx^{m-1} =
\Phi_m'(x)g(x) + \Phi_m(x)g'(x)$. Substitute $x = \zeta$. The result is $m\zeta^{m-1} = \Phi_m'(\zeta)g(\zeta)$.
Now take the norm of both sides. Using Proposition 12.1.4 and the fact that
$N(\zeta) = \pm 1$ we find $\pm m^{\phi(m)} = \Delta N(g(\zeta))$. We note, by Theorem 1, that $1,
\zeta, \ldots, \zeta^{\phi(m)-1}$ is a field basis for $\mathbb{Q}(\zeta)/\mathbb{Q}$ so that $\Delta(1, \zeta, \ldots, \zeta^{\phi(m)-1}) \ne 0$. □

Proposition 13.2.4. Let $p \in \mathbb{Z}$ be a prime such that $p \nmid m$. Let $w \in D$, the ring of
integers in $\mathbb{Q}(\zeta)$. There is an element $\sum a_i\zeta^i \in \mathbb{Z}[\zeta]$ such that $w \equiv \sum a_i\zeta^i \ (p)$.

Proof. Let $\Delta = \Delta(1, \zeta, \ldots, \zeta^{\phi(m)-1})$. By Lemma 2, $p \nmid \Delta$. Thus there is a
$\Delta' \in \mathbb{Z}$ such that $\Delta'\Delta \equiv 1 \ (p)$. Thus $w \equiv \Delta'\Delta w \ (p)$. By Lemma 1, $\Delta w \in \mathbb{Z}[\zeta]$.
Thus the result. □

We remark that in fact $D = \mathbb{Z}[\zeta]$ but this is not so easy to prove for general
$m$. When $m$ is a prime however, the proof is reasonably easy (see Proposition
13.2.10).

Corollary. Suppose $p \nmid m$ and $n > 0$ is such that $p^n \equiv 1 \ (m)$. Then, for $w \in D$ we
have $w^{p^n} \equiv w \ (p)$.

Proof. By the proposition, $w \equiv \sum a_i\zeta^i \ (p)$ with the $a_i \in \mathbb{Z}$. Since $a_i^p \equiv a_i \ (p)$ we
must have $w^p \equiv \sum a_i\zeta^{pi} \ (p)$. Repeating this process $n$ times and using the fact
that $p^n \equiv 1 \ (m)$ implies $\zeta^{p^n} = \zeta$ yields the result. □

Proposition 13.2.5. If $p$ is a prime and $p \nmid m$ then every prime ideal $P$ in $D$
containing $p$ is unramified.

Proof. Assume $P$ is ramified. Then $(p) \subset P^2$. Let $w$ be an element of $P$ not in
$P^2$. By the above corollary $w^{p^n} \equiv w \ (p)$ and so $w^{p^n} \equiv w \ (P^2)$. Since $p^n > 1$ it
follows that $w \in P^2$, a contradiction. □

We will see later that the converse of this proposition is "almost" true.
See Proposition 13.2.8.

Recall that, for $p$ prime, $p \nmid m$, the automorphism $\sigma_p$ sends $\zeta$ to $\zeta^p$.

Proposition 13.2.6. For all $w \in D$ we have $\sigma_p w \equiv w^p \ (p)$.

Proof. By Proposition 13.2.4 we have $w \equiv \sum a_i\zeta^i \ (p)$. Apply $\sigma_p$ to both sides.
We find that $\sigma_p w \equiv \sum a_i\zeta^{pi} \ (p)$. Since the $a_i \in \mathbb{Z}$ we have $\sum a_i\zeta^{pi} \equiv \sum a_i^p\zeta^{pi} \equiv
(\sum a_i\zeta^i)^p \ (p)$. Thus $\sigma_p w \equiv w^p \ (p)$ as asserted. □

Corollary. Let $P$ be a prime ideal of $D$ containing $p$. Then $\sigma_p P = P$.

Proof. If $w \in P$ then $\sigma_p w \equiv w^p \equiv 0 \ (P)$ and so $\sigma_p P \subset P$. Since $\sigma_p P$ is a
maximal ideal we have equality. □

Theorem 2. Let $p$ be a prime, $p \nmid m$. Let $f$ be the smallest positive integer such
that $p^f \equiv 1 \ (m)$. Then in $D \subset \mathbb{Q}(\zeta)$ we have

$$(p) = P_1P_2 \cdots P_g,$$

where each $P_i$ has degree $f$ and $g = \phi(m)/f$.

Proof. We first observe that it follows directly from the definition that $f$ is the
order of the automorphism $\sigma_p$.

Now, $p^{f_1} = |D/P_1|$ where $f_1$ is the degree of $P_1$. Since $D/P_1$ is a finite field
we have $w^{p^{f_1}} \equiv w \ (P_1)$ for all $w \in D$ and $f_1$ is the smallest positive integer with
this property.

By the last proposition, we have $w = \sigma_p^f(w) \equiv w^{p^f} \ (P_1)$ for all $w \in D$. It
follows that $f_1 \le f$.

On the other hand, $\zeta^{p^{f_1}} \equiv \zeta \ (P_1)$ implies $\zeta^{p^{f_1}} = \zeta$ by Proposition 13.2.3.
Thus $p^{f_1} \equiv 1 \ (m)$ and it follows that $f \le f_1$.

We now see $f = f_1 =$ degree of $P_1$. All the $P_i$ have degree $f$. By Proposition
13.2.5 all the $P_i$ are unramified. Using the relation $efg = \phi(m)$ we conclude
$g = \phi(m)/f$. □
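
Theorem 2 reduces the decomposition of $p$ in $\mathbb{Q}(\zeta_m)$ to computing the order of $p$ modulo $m$. The Python sketch below does just that; the function names `euler_phi` and `decomposition` are our own, and the primality of $p$ is assumed rather than checked.

```python
from math import gcd


def euler_phi(m: int) -> int:
    """Euler's phi function, by direct count."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def decomposition(p: int, m: int) -> tuple:
    """Return (f, g) for a prime p not dividing m, as in Theorem 2."""
    if m % p == 0:
        raise ValueError("Theorem 2 requires that p not divide m")
    one = 1 % m  # equals 0 when m = 1, so that case terminates at once
    f, power = 1, p % m
    while power != one:
        power = (power * p) % m
        f += 1
    return f, euler_phi(m) // f
```

For example, `decomposition(2, 7)` gives `(3, 2)`: in $\mathbb{Q}(\zeta_7)$ the prime 2 is a product of two prime ideals of degree 3.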

Corollary. With the notation of the theorem, let $P$ be one of the $P_i$. Define
$G(P) = \{\sigma \in G \mid \sigma P = P\}$. Then $G(P)$ is a cyclic group generated by $\sigma_p$.

Proof. By the corollary to Proposition 13.2.6 we know $\sigma_p \in G(P)$. Let $\langle\sigma_p\rangle$
be the cyclic group generated by $\sigma_p$. Then $\langle\sigma_p\rangle \subset G(P)$. By Proposition
12.3.3 we have $g|G(P)| = \phi(m)$. Thus $|G(P)| = \phi(m)/g = f = |\langle\sigma_p\rangle|$ and
we are done. □

Theorem 2 is a very satisfactory result on the decomposition of primes 
which do not divide m. One can also find the decomposition of those primes 
which do divide m. We content ourselves with the following important special 
case. 

Proposition 13.2.7. Let $l$ be a prime in $\mathbb{Z}$. Then, in $\mathbb{Q}(\zeta_l)$, $l$ ramifies completely.
More precisely, let $L = (1 - \zeta_l)$. Then $L$ is a prime ideal and $(l) = L^{l-1}$.
Moreover $L$ has degree 1.

Proof. As in the proof of Proposition 13.2.3 we have $l = \prod (1 - \zeta_l^i)$ where the
product is over $1 \le i \le l - 1$.

Let $u_i = (1 - \zeta^i)/(1 - \zeta) = 1 + \zeta + \cdots + \zeta^{i-1}$. We claim that $u_i$ is a unit.
Since $l \nmid i$ there is a $j \in \mathbb{Z}$ such that $ij \equiv 1 \ (l)$. Thus, $u_i^{-1} = (1 - \zeta)/(1 - \zeta^i) =
(1 - \zeta^{ij})/(1 - \zeta^i) = 1 + \zeta^i + \cdots + (\zeta^i)^{j-1}$ is an algebraic integer, which
proves the claim.

It follows that $l = \prod (1 - \zeta^i) = (1 - \zeta)^{l-1} \prod u_i$ and so $(l) = L^{l-1}$. Using
the relation $efg = \phi(l) = l - 1$ we see $L$ must be prime, $e = l - 1$, $g = 1$,
and $f = 1$. □

Proposition 13.2.8. Let $P$ be a prime ideal in $\mathbb{Q}(\zeta_m)$ and set $P \cap \mathbb{Z} = p\mathbb{Z}$. If $p$ is
odd then $P$ is ramified iff $p \mid m$. If $p = 2$ then $P$ is ramified iff $4 \mid m$.

Proof. By Proposition 13.2.5 we know that $p \nmid m$ implies $P$ is unramified.

Suppose $p$ is odd and $p \mid m$. Then $\mathbb{Q}(\zeta_p) \subset \mathbb{Q}(\zeta_m)$. Let $D_p$ and $D_m$ be the rings
of integers in $\mathbb{Q}(\zeta_p)$ and $\mathbb{Q}(\zeta_m)$ respectively. By the last proposition $pD_p =
(1 - \zeta_p)^{p-1}$. Write $(1 - \zeta_p)D_m = P_1P_2 \cdots P_t$ where the $P_i$ are, not necessarily
distinct, prime ideals in $D_m$. Then $pD_m = (P_1P_2 \cdots P_t)^{p-1}$. Since $p - 1 > 1$
all the primes in $D_m$ containing $p$ are ramified.

Now suppose $p = 2$. If $2 \mid m$ but $4 \nmid m$ then $m = 2m_0$, with $m_0$ odd. In this
case, $-\zeta_{m_0}$ is a primitive $m$th root of unity so $\mathbb{Q}(\zeta_m) = \mathbb{Q}(\zeta_{m_0})$. Since $2 \nmid m_0$,
$P$ is unramified.

Finally, suppose $p = 2$ and $4 \mid m$. Then $\zeta_4 = \sqrt{-1} = i \in \mathbb{Q}(\zeta_m)$. Since
$(1 - i)^2 = -2i$ we see $2D_m = ((1 - i)D_m)^2$ and it follows, as before, that all
the primes in $D_m$ containing 2 are ramified. □

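Suppose $p$ is a prime and $p \nmid m$. For later use (in the next chapter) we need
to know how $p$ decomposes in the field $\mathbb{Q}(\zeta_p, \zeta_m)$.

Lemma 3. If $(m, n) = 1$ then $\mathbb{Q}(\zeta_m, \zeta_n) = \mathbb{Q}(\zeta_{mn})$.

Proof. Since $\zeta_{mn}^n = \zeta_m$ and $\zeta_{mn}^m = \zeta_n$ we have $\mathbb{Q}(\zeta_m, \zeta_n) \subset \mathbb{Q}(\zeta_{mn})$.

On the other hand, since $(m, n) = 1$ there exist integers $u$ and $v$ such that
$um + vn = 1$. Thus $\zeta_{mn} = \zeta_{mn}^{um}\zeta_{mn}^{vn} = \zeta_n^u\zeta_m^v \in \mathbb{Q}(\zeta_m, \zeta_n)$. □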

Proposition 13.2.9. Let $p$ be a prime such that $p \nmid m$. Let $D$ be the ring of
integers in $\mathbb{Q}(\zeta_p, \zeta_m)$. Then

$$pD = (\mathcal{P}_1\mathcal{P}_2 \cdots \mathcal{P}_g)^{p-1},$$

where the $\mathcal{P}_i$ are distinct prime ideals of degree $f$ and $g = \phi(m)/f$. The integer $f$
is the least positive integer such that $p^f \equiv 1 \ (m)$.

Proof. Since $\mathbb{Q}(\zeta_p) \subset \mathbb{Q}(\zeta_p, \zeta_m)$ we see, as in the proof of the last proposition,
that all the ramification indices of primes in $D$ containing $p$ are divisible by
$p - 1$. Thus

$$pD = (\mathcal{P}_1\mathcal{P}_2 \cdots \mathcal{P}_{g'})^{e'(p-1)} \qquad (*)$$

where the $\mathcal{P}_i$ are distinct prime ideals of degree $f'$, say, and $e' \ge 1$ is some
integer.

Let $D_m$ be the ring of integers in $\mathbb{Q}(\zeta_m)$. By Theorem 2

$$pD_m = P_1P_2 \cdots P_g$$

where the $P_i$ are prime ideals in $D_m$ of degree $f$ and $g = \phi(m)/f$.

By considering the prime decomposition of $P_iD$ and comparing with
equation (*) we see $f' \ge f$ and $g' \ge g$.

From equation (*) and Lemma 3 we see

$$(p - 1)\phi(m) = e'(p - 1)f'g' \ge e'(p - 1)fg.$$

It follows that $\phi(m) \ge e'\phi(m)$. Thus $e' = 1$ and all the inequalities are
equalities, i.e., $f' = f$ and $g' = g = \phi(m)/f$. This concludes the proof. □


We conclude this section by showing that $D = \mathbb{Z}[\zeta_l]$ when $l$ is prime. This
result holds even when $l$ is not prime but the proof is more difficult (see, for
example, pp. 265-268, [207]). The case when $l$ is prime will be needed in
Chapter 17 where a special case of Fermat's conjecture is discussed.

Proposition 13.2.10. If $l$ is prime then $D = \mathbb{Z}[\zeta_l]$.

Proof. Clearly $\mathbb{Z}[\zeta_l] \subset D$. If $\alpha \in D$ there exist $a_0, a_1, \ldots, a_{l-2}$ rational
numbers such that $\alpha = a_0 + a_1\zeta + \cdots + a_{l-2}\zeta^{l-2}$. We show first of all that
$la_i \in \mathbb{Z}$, $i = 0, \ldots, l - 2$. For if tr denotes the trace map from $\mathbb{Q}(\zeta)$ to $\mathbb{Q}$ then
one computes easily $\mathrm{tr}(\zeta^j) = -1$ if $l \nmid j$, using say, Corollary 1 of Theorem 1.
Thus one sees that $\mathrm{tr}(\alpha\zeta^{-s}) = -a_0 - \cdots - a_{s-1} + (l - 1)a_s - a_{s+1}
- \cdots - a_{l-2}$. Therefore $\mathrm{tr}(\alpha\zeta^{-s} - \alpha\zeta) = la_s$, $s = 0, \ldots, l - 2$. Since $\alpha\zeta^{-s} -
\alpha\zeta \in D$ it follows that $la_s \in \mathbb{Z}$. If $\lambda = 1 - \zeta$ then by Proposition 13.2.7 one has
$(\lambda)^{l-1} = (l)$. By the above there exist $b_0, \ldots, b_{l-2}$ in $\mathbb{Z}$ such that $l\alpha =
b_0 + b_1\lambda + \cdots + b_{l-2}\lambda^{l-2}$. Thus $\lambda \mid b_0$ and taking norms shows that $l \mid b_0$.
Thus $\lambda^{l-1} \mid b_0$ and reduction modulo $\lambda^2$ gives $\lambda^2 \mid b_1\lambda$ so that $\lambda \mid b_1$. Again this
implies $l \mid b_1$. Clearly, successive reduction modulo higher powers of $\lambda$ leads to
$l \mid b_j$, $j = 0, \ldots, l - 2$, and division by $l$ then shows that $\alpha \in \mathbb{Z}[\zeta_l]$. □


§3 Quadratic Reciprocity Revisited 


As an application of some of the theory developed in this chapter we give yet 
another proof of quadratic reciprocity. The idea for this proof goes back, in 
essence, to Kronecker. 

Let $p$ be an odd prime and consider the field $\mathbb{Q}(\zeta_p)$. We claim that this field
contains the square root of $(-1)^{(p-1)/2}p = p^*$. This follows from Proposition
6.3.2. However, in order to make our present considerations independent of
the theory of Gauss sums, we give a direct proof using the relation

$$p = \prod_{i=1}^{p-1} (1 - \zeta^i).$$

We combine the terms corresponding to $i$ and $p - i$ as follows

$$(1 - \zeta^i)(1 - \zeta^{p-i}) = (1 - \zeta^i)(1 - \zeta^{-i}) = -\zeta^{-i}(1 - \zeta^i)^2.$$

Thus

$$p = (-1)^{(p-1)/2}\zeta^b \prod_{i=1}^{(p-1)/2} (1 - \zeta^i)^2 \quad \text{where } b = -1 - 2 - \cdots - \frac{p - 1}{2}.$$

Let $c \in \mathbb{Z}$ be such that $2c \equiv 1 \ (p)$. Then $\zeta^b = (\zeta^{bc})^2$. It follows that $p^*$ is a
square in $\mathbb{Q}(\zeta)$ as asserted. Let $\tau^2 = p^*$.
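
The construction above gives an explicit square root, namely $\tau = \zeta^{bc}\prod (1 - \zeta^i)$ with the product over $1 \le i \le (p - 1)/2$, and this can be tested in floating-point arithmetic. The Python sketch below is only a numerical sanity check; the function names `tau` and `p_star` are our own, and the choice $c = (p + 1)/2$ is one convenient solution of $2c \equiv 1 \ (p)$.

```python
import cmath


def tau(p: int) -> complex:
    """Numerical value of zeta^(b*c) * prod_{i=1}^{(p-1)/2} (1 - zeta^i)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    half = (p - 1) // 2
    b = -sum(range(1, half + 1))  # b = -1 - 2 - ... - (p-1)/2
    c = (p + 1) // 2              # 2c = p + 1, so 2c = 1 (mod p)
    value = zeta ** ((b * c) % p)
    for i in range(1, half + 1):
        value *= 1 - zeta ** i
    return value


def p_star(p: int) -> int:
    """The signed prime p* = (-1)^((p-1)/2) * p."""
    return (-1) ** ((p - 1) // 2) * p
```

For $p = 3$ this gives $\tau = \zeta - \zeta^2 = i\sqrt{3}$, whose square is $-3 = p^*$.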

Now suppose $q$ is an odd prime, $q \ne p$. Consider the automorphism $\sigma_q$.
Then $\sigma_q\tau = \pm\tau$ with the plus sign holding iff $\sigma_q$ is in the Galois group of
$\mathbb{Q}(\zeta_p)/\mathbb{Q}(\tau)$. Since the Galois group $G$ of $\mathbb{Q}(\zeta)/\mathbb{Q}$ is isomorphic via $\theta$ to
$U(\mathbb{Z}/p\mathbb{Z})$ and the latter group is cyclic of order $p - 1$ we see $\sigma_q\tau = \tau$ iff $\sigma_q$ is a
square in $G$ and this is so iff $\bar{q}$ is a square in $U(\mathbb{Z}/p\mathbb{Z})$. In other words

$$\sigma_q\tau = (q/p)\tau.$$

Let $Q$ be a prime ideal in $D \subset \mathbb{Q}(\zeta)$ containing $q$. By Proposition 13.2.6 we
have

$$\sigma_q\tau \equiv \tau^q \ (Q).$$

Thus $(q/p)\tau \equiv \tau^q \ (Q)$ implying $(p^*/q) \equiv (p^*)^{(q-1)/2} = \tau^{q-1} \equiv (q/p) \ (Q)$.
This latter congruence implies $(p^*/q) = (q/p)$ since $Q$ does not contain 2.
It may be thought that this proof, pretty as it is, is much more complicated 
than the previous proofs and so does not add much. This is not the case, 
because the ideas involved provide the key to studying higher reciprocity 
laws. 
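
The form of the reciprocity law obtained here, $(p^*/q) = (q/p)$, is easily confirmed for small primes. The following Python sketch evaluates both Legendre symbols by Euler's criterion; the function names are our own, and the inputs are assumed to be distinct odd primes.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def reciprocity_holds(p: int, q: int) -> bool:
    """Check (p*/q) == (q/p) for distinct odd primes p and q."""
    p_star = (-1) ** ((p - 1) // 2) * p
    return legendre(p_star, q) == legendre(q, p)
```

For example, with $p = 3$ and $q = 7$ we have $p^* = -3$, and both $(-3/7)$ and $(7/3)$ equal 1.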

Notes 

There is an introduction to the arithmetic of quadratic number fields in
J. Sommer's Introduction à la Théorie des Nombres Algébriques (Hermann:
Paris, 1911). This book is based upon D. Hilbert's lectures in 1897-1898. See
also F. Châtelet [111], W. Adams and L. Goldstein [84], and H. Stark [73].

As mentioned earlier all imaginary quadratic fields whose ring of integers
forms a unique factorization domain have been determined. The imaginary
quadratic fields of class number two have also been determined. There are 18
such fields, the one with smallest discriminant being $\mathbb{Q}(\sqrt{-427})$.

In the case of cyclotomic fields Masley has shown that if $m$ is a positive
integer, $m \not\equiv 2 \ (4)$, then there are exactly 29 values of $m$ for which $\mathbb{Q}(\zeta_m)$ has
class number one. Furthermore, the prime cyclotomic fields $\mathbb{Q}(\zeta_p)$ of class
number one are given by $p = 3, 5, 7, 11, 13, 17, 19$, a result due to Uchida and
Montgomery. For more details see the surveys by Masley [184], [185].

For a more thorough treatment of the arithmetic of quadratic and 
cyclotomic number fields the reader should consult the treatise of Borevich 
and Shafarevich [9]. 

In Section 3, we saw that $\mathbb{Q}(\sqrt{(-1)^{(p-1)/2}p})$ is a subfield of $\mathbb{Q}(\zeta_p)$. More
generally, according to a theorem of Kronecker and Weber any algebraic
number field which is Galois with an abelian Galois group is a subfield of
$\mathbb{Q}(\zeta_m)$ for some $m$. For a proof of this difficult theorem see P. Ribenboim [207].

Exercises 

1. Show that an algebraic number field of odd degree cannot contain a primitive nth 
root of unity n > 2. 

2. Let $F$ be a real quadratic field. Show that if $F$ has an element of norm $-1$ then no
prime $p \equiv 3 \ (4)$ is ramified.

3. Prove that if $F$ is an algebraic number field such that $e^{2\pi i/n} \in F$ for some $n \ge 3$ then
the norm of any nonzero element of $F$ is positive.

4. Find the fundamental unit for $\mathbb{Q}(\sqrt{5})$, $\mathbb{Q}(\sqrt{15})$, $\mathbb{Q}(\sqrt{2})$, $\mathbb{Q}(\sqrt{3})$, $\mathbb{Q}(\sqrt{624})$.

5. Show that a quadratic number field cannot contain $\sqrt{p}$ and $\sqrt{q}$ for two distinct
primes $p$ and $q$.

6. List the subfields of $\mathbb{Q}(\zeta_8)$.

7. Let F be a real quadratic field. Show that there are algebraic integers in F arbitrarily 
close to 1 and distant from 1. 

8. Show that the class number of $\mathbb{Q}(\sqrt{-10})$ is not 1.

9. Let $p$ be an odd prime and consider $\mathbb{Q}(\zeta_p)$.

(a) Show that $N(1 + \zeta) = 1$ where $N$ denotes the norm from $\mathbb{Q}(\zeta_p)$ to $\mathbb{Q}$.

(b) Show that $\prod (1 + \zeta^s) = A$, the product being over the squares modulo $p$, is in
$\mathbb{Q}(\sqrt{p^*})$, where $p^* = (-1)^{(p-1)/2}p$.

(c) If $p \equiv 1 \ (4)$, show that $A = (t + u\sqrt{p})/2$ with $t \equiv u \ (2)$.

(d) Conclude from (a) that $((t^2 - pu^2)/4)^{(p-1)/2} = +1$ so that

(e) $t^2 - pu^2 = \pm 4$.

(f) Show that $A \ne -1$ by showing that $A > 0$ (compare Exercise 3).

Now let $p \equiv 5 \ (8)$.

(g) Show that $A \ne 1$ by considering the polynomial $\prod_s (1 + x^s) - 1$,
$s = 1^2, 2^2, \ldots, ((p - 1)/2)^2$. (See also Exercise 9, Chapter 16.) This exercise is
adapted from Hartung [145].

10. For which $d$ does $\mathbb{Q}(\sqrt{d})$ have an integral basis of the form $\alpha, \alpha'$ where $\alpha'$ is the
conjugate of $\alpha$?

11. Show that $-(\zeta^3 + \zeta^2)$ is a unit in $\mathbb{Q}(\zeta)$, $\zeta = e^{2\pi i/5}$. What is the relation between this
unit and the units in $\mathbb{Q}(\sqrt{5})$?

12. Show that $\sin(\pi j/p)/\sin(\pi/p)$ is a unit in $\mathbb{Q}(\zeta_p)$, $1 \le j \le p - 1$.

13. Show that if $p \equiv 1 \ (4)$, $p$ prime, then the ring of integers in $\mathbb{Q}(\zeta_p)$ always contains an
infinite number of units.

14. Let $p$ be prime. Show that the discriminant $\Delta$ of $\mathbb{Q}(\zeta_p)$ is $\prod_{i<j} (\zeta^i - \zeta^j)^2$, $1 \le i,
j \le p - 1$.

15. The notation being as in Exercise 14 show

(a) $-p\zeta^{-j}/(1 - \zeta^j) = \prod (\zeta^j - \zeta^i)$, the product over all $i$ with $i \ne j$, $1 \le i, j \le p - 1$.

(b) Multiply for $j = 1, 2, \ldots, p - 1$ to obtain $\Delta = (-1)^{(p-1)/2}p^{p-2}$.

16. Use Proposition 13.2.8 to show that $i \notin \mathbb{Q}(\zeta_p)$, $p$ odd.

17. Use Propositions 13.2.7 and 13.2.8 to show that $\zeta_q \notin \mathbb{Q}(\zeta_p)$ if $p$ and $q$ are odd primes,
$p \ne q$.

18. Show that if $p$ is a prime congruent to 3 modulo 4 then $\mathbb{Q}(\sqrt{p})$ is contained in the
cyclotomic field $\mathbb{Q}(\zeta_{4p})$.

19. Show that any quadratic number field is contained in a cyclotomic field.

20. Show that the fundamental unit of the real quadratic field $\mathbb{Q}(\sqrt{10})$ is $3 + \sqrt{10}$ and
using the formula given in the text determine the class number of the field.


21. Let $a \in \mathbb{Z}$, $a$ not a square, $a \equiv 0 \ (4)$ or $a \equiv 1 \ (4)$. Define the Kronecker symbol $\chi_a$
as follows. If $p \mid a$, $\chi_a(p) = 0$. If $p \nmid a$ is an odd prime then $\chi_a(p) = (a/p)$, the Legendre
symbol; $\chi_a(2) = 1$ if $a \equiv 1 \ (8)$, $\chi_a(2) = -1$ if $a \equiv 5 \ (8)$. Finally $\chi_a(b) = \prod \chi_a(p_i)$
if $\pm b = p_1 \cdots p_t$. Show

(a) For $b$ odd $\chi_a$ coincides with the Jacobi symbol.

(b) Ifb > 0,(<3, b) = l,a = 2V withe odd then= XiQ ) ) t Xb( c X~ l) ((c_ 1)/2)((b ~ 1)/2) 
(c) $\chi_a(x) = \chi_a(y)$ if $x \equiv y \ (a)$.


22. Let $K$ be a quadratic number field with discriminant $d$, and let $\chi_d$ be the Kronecker
symbol. Show, for $p$ any prime,

(a) $p$ splits in $K$ iff $\chi_d(p) = 1$.

(b) $p$ is inertial iff $\chi_d(p) = -1$.

(c) $p$ ramifies iff $\chi_d(p) = 0$.

23. Using the table in Stark [73], p. 340, along with the tables in Borevich-Shafarevich
[9], pp. 422-425 verify the Hirzebruch formula stated at the end of Section 1 for
the primes 7, 19, 23, 31, 43, 47, 67, 83. Furthermore check the class numbers for the
imaginary quadratic fields using Dirichlet's formula. Show that, knowing the class
number of $\mathbb{Q}(\sqrt{-91})$ to be 2, $\mathbb{Q}(\sqrt{91})$ is not a principal ideal ring.

24. Let $K$ be the field of $p$th roots of unity, $p$ an odd prime. Show, without using Gauss
sums, that the unique quadratic subfield of $K$ has discriminant $(-1)^{(p-1)/2}p$.

25. The situation being as in the preceding problem, let $f$ be the order of $q$ modulo $p$,
for an odd prime $q \ne p$. If $E$ denotes the quadratic subfield of $K$ show that $q$
splits in $E$ iff $E$ is contained in the subfield $D$ of degree $(p - 1)/f$. Show furthermore
a new proof of the law of quadratic reciprocity. 


26. Count the number of proofs to the law of quadratic reciprocity given thus far in this 
book and devise another one. 

27. Show that there are no primes which remain prime in $\mathbb{Q}(\zeta_8)$. Can you generalize?



Chapter 14 


The Stickelberger Relation and 
the Eisenstein Reciprocity Law 


Having developed the basic properties of cyclotomic fields 
we will prove two beautiful and important theorems which 
play a fundamental role in the further development of the 
theory of these fields. 

The Eisenstein reciprocity law generalizes some of our 
previous work on quadratic and cubic reciprocity. It lies 
midway between these special cases and the more general 
reciprocity laws investigated by Kummer and Hilbert, 
proven first by Furtwängler and then in full generality by
Artin and Hasse. In the last section of this chapter we will
give two interesting applications of Eisenstein’s result. 
The first concerns Fermat's Last Theorem and the second 
the theory of power residues. 

The Stickelberger relation is the basis for the proof 
we give of Eisenstein reciprocity. Its importance goes far 
beyond that. In recent years the theory of cyclotomic 
fields has been dramatically advanced principally due to the 
efforts of K. Iwasawa. In his work the Stickelberger 
relation occupies a central position. It has also turned out 
to be of importance in arithmetic algebraic geometry. 


§1 The Norm of an Ideal 

We will need a few more results from the general theory of algebraic number 
fields. 

Let $K/\mathbb{Q}$ be an algebraic number field, $D$ the ring of integers in $K$, and
$A$ an ideal. We define $N(A)$, the norm of $A$, to be the number of elements in
$D/A$. We continue to assume that ideals are nonzero.

Proposition 14.1.1. If $A, B \subset D$ are ideals, then $N(AB) = N(A)N(B)$.

Proof. If $A$ and $B$ are relatively prime, then $D/AB \cong D/A \oplus D/B$ so the
assertion is clear in this case.

Let $A = P_1^{a_1}P_2^{a_2} \cdots P_t^{a_t}$ be the prime decomposition of $A$. We claim
$N(A) = (N(P_1))^{a_1}(N(P_2))^{a_2} \cdots (N(P_t))^{a_t}$. On the basis of what has been
said it will be sufficient to prove $N(P^a) = (N(P))^a$ for any prime ideal $P$.
This, however, is just a reformulation of Proposition 12.3.2.

Now, in the general case, decompose $A$ and $B$ into a product of prime
ideals, multiply, apply the above result, and rearrange terms. The result
follows. □

Proposition 14.1.2. Suppose $K/\mathbb{Q}$ is a Galois extension with group $G$. Then

$$ \prod_{\sigma \in G} \sigma(A) = (N(A)). $$

Proof. Since both sides are multiplicative in $A$ it suffices to prove the result
when $A$ is a prime ideal $P$.

Let $P_1, P_2, \ldots, P_g$ be the distinct prime ideals in the set $\{\sigma(P) \mid \sigma \in G\}$.
Then $|G| = g\,|G(P)|$ where $G(P) = \{\sigma \in G \mid \sigma(P) = P\}$. Since $efg = n =
[K:\mathbb{Q}] = |G|$ we see $|G(P)| = ef$. Thus, using Proposition 12.3.3 and
Theorem 3′, Chapter 12,

$$ \prod_{\sigma \in G} \sigma(P) = (P_1 P_2 \cdots P_g)^{ef} = (p)^f = (p^f), \quad \text{where } P \cap \mathbb{Z} = p\mathbb{Z}. $$

Since $N(P) = |D/P| = p^f$, this completes the proof. □

Proposition 14.1.3. Let $K/\mathbb{Q}$ be Galois with group $G$. Let $\alpha \in D$ and let $A = (\alpha)$
be the principal ideal generated by $\alpha$. Let $N(\alpha)$ be the norm of $\alpha$. Then
$N(A) = |N(\alpha)|$.

Proof. $(N(A)) = \prod \sigma(A) = \prod \sigma((\alpha)) = \prod (\sigma\alpha) = (\prod \sigma(\alpha)) = (N(\alpha))$. Thus
$N(A)$ and $N(\alpha)$ differ by a unit. Since they are both in $\mathbb{Z}$ they can differ
only by sign. Since $N(A)$ is, by definition, positive, we have $N(A) = |N(\alpha)|$
as asserted. □

We remark that the above proposition is true even if K/Q is not a Galois 
extension. The proof in the general case is somewhat more complicated. 
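
As a small sanity check of Proposition 14.1.3, the following sketch counts the residue classes of $\mathbb{Z}[i]$ modulo a principal ideal $(\alpha)$ by brute force and compares the count with $|N(\alpha)| = a^2 + b^2$. The choice of $K = \mathbb{Q}(i)$ and the helper names `divides` and `ideal_norm` are illustrative assumptions, not part of the text.

```python
# Count the residue classes of Z[i] modulo (alpha) and compare with a^2 + b^2.
# Gaussian integers are stored as pairs (x, y) meaning x + y*i.

def divides(alpha, z):
    """Return True if the Gaussian integer alpha divides the Gaussian integer z."""
    a, b = alpha
    x, y = z
    n = a * a + b * b
    # z / alpha = z * conj(alpha) / n, so both coordinates must be divisible by n.
    re = x * a + y * b
    im = y * a - x * b
    return re % n == 0 and im % n == 0

def ideal_norm(alpha):
    """Number of elements of Z[i]/(alpha), found by brute force."""
    a, b = alpha
    n = a * a + b * b
    reps = []
    # n = alpha * conj(alpha) lies in (alpha), so every class meets the box [0, n)^2.
    for x in range(n):
        for y in range(n):
            if not any(divides(alpha, (x - u, y - v)) for (u, v) in reps):
                reps.append((x, y))
    return len(reps)

for alpha in [(2, 1), (3, 0), (1, 1), (2, 3)]:
    a, b = alpha
    print(alpha, ideal_norm(alpha), a * a + b * b)
```

In each printed row the last two numbers should agree, as the proposition predicts.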


§2 The Power Residue Symbol 

Let $m$ be a positive integer, and denote by $D_m$ the ring of integers in $\mathbb{Q}(\zeta_m)$.
Let $P$ be a prime ideal in $D_m$ not containing $m$. Let $q = N(P) = |D_m/P|$. By
Proposition 13.2.3 we know that the cosets of $1, \zeta_m, \ldots, \zeta_m^{m-1}$ are distinct
and $q \equiv 1 \ (m)$.

Proposition 14.2.1. Let $\alpha \in D_m$, $\alpha \notin P$. There is an integer $i$, unique modulo $m$,
such that

$$ \alpha^{(q-1)/m} \equiv \zeta_m^{i} \ (P). $$

Proof. Since the multiplicative group of $D_m/P$ has $q - 1$ elements we have
$\alpha^{q-1} \equiv 1 \ (P)$. Thus

$$ \prod_{i=0}^{m-1} \left( \alpha^{(q-1)/m} - \zeta_m^{i} \right) \equiv 0 \ (P). $$

Since $P$ is a prime ideal there is an integer $i$, $0 \le i < m$, such that
$\alpha^{(q-1)/m} \equiv \zeta_m^{i} \ (P)$. If $i \not\equiv j \ (m)$ then $\zeta_m^{i} \not\equiv \zeta_m^{j} \ (P)$, so $i$ is unique modulo $m$. □

Definition. For $\alpha \in D_m$ and $P$ a prime ideal not containing $m$, define the
$m$th power residue symbol, $(\alpha/P)_m$, as follows:

(a) $(\alpha/P)_m = 0$ if $\alpha \in P$.

(b) If $\alpha \notin P$, $(\alpha/P)_m$ is the unique $m$th root of unity such that
$\alpha^{(NP-1)/m} \equiv (\alpha/P)_m \ (P)$.

Proposition 14.2.2.

(a) $(\alpha/P)_m = 1$ iff $x^m \equiv \alpha \ (P)$ is solvable in $D_m$.

(b) For all $\alpha \in D_m$, $\alpha^{(NP-1)/m} \equiv (\alpha/P)_m \ (P)$.

(c) $(\alpha\beta/P)_m = (\alpha/P)_m (\beta/P)_m$.

(d) If $\alpha \equiv \beta \ (P)$ then $(\alpha/P)_m = (\beta/P)_m$.

Proof. Since the result has been proven earlier for m = 2, 3, and 4 we may 
safely leave the details to the reader. □ 

Corollary. Suppose $P$ is a prime ideal not containing $m$. Then

$$ (\zeta_m/P)_m = \zeta_m^{(NP-1)/m}. $$

Proof. From part (b) of the proposition, both sides of the above equality
are congruent modulo $P$. Since they are both $m$th roots of unity and $m \notin P$,
it follows that they are equal. □
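
When $P$ has degree 1 the residue field $D_m/P$ is $\mathbb{Z}/p\mathbb{Z}$ with $p \equiv 1 \ (m)$, and for a rational integer $a$ the defining congruence can be tested with ordinary modular arithmetic. The sketch below computes only the residue class of $a^{(p-1)/m}$ modulo $p$ (the image of the symbol in the residue field, not the root of unity itself) and checks part (a) of Proposition 14.2.2. The function names and the sample values $p = 13$, $m = 3$ are assumptions made for illustration.

```python
def power_residue_class(a, p, m):
    """Residue of a^((p-1)/m) mod p, for a prime p = 1 (mod m) and p not dividing a."""
    assert (p - 1) % m == 0 and a % p != 0
    return pow(a, (p - 1) // m, p)

def is_mth_power(a, p, m):
    """Brute-force test: is x^m = a (mod p) solvable with x nonzero?"""
    return any(pow(x, m, p) == a % p for x in range(1, p))

p, m = 13, 3
for a in range(1, p):
    r = power_residue_class(a, p, m)
    assert pow(r, m, p) == 1                   # r is an m-th root of unity mod p
    assert (r == 1) == is_mth_power(a, p, m)   # Proposition 14.2.2(a)
    print(a, r)
```

The class equals 1 exactly for the cubes modulo 13, which is the content of part (a) in this special case.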

It is important to extend the definition of $(\alpha/P)_m$ in such a way that
$(\alpha/\beta)_m$ makes sense when $\beta$ is prime to $m$. This is done as follows:

Definition. Suppose $A \subset D_m$ is an ideal prime to $m$. Let $A = P_1 P_2 \cdots P_n$ be
the prime decomposition of $A$. For $\alpha \in D_m$ define $(\alpha/A)_m = \prod_{i=1}^{n} (\alpha/P_i)_m$.

If $\beta \in D_m$ and $\beta$ is prime to $m$ define $(\alpha/\beta)_m = (\alpha/(\beta))_m$.

Proposition 14.2.3. Suppose $A$ and $B$ are ideals prime to $(m)$. Then

(a) $(\alpha\beta/A)_m = (\alpha/A)_m (\beta/A)_m$.

(b) $(\alpha/AB)_m = (\alpha/A)_m (\alpha/B)_m$.

(c) If $\alpha$ is prime to $A$ and $x^m \equiv \alpha \ (A)$ is solvable in $D_m$ then $(\alpha/A)_m = 1$.

Proof. All three assertions are straightforward to prove using the last
proposition and the above definition. We remark that the converse of part
(c) is not true. □

We will need to see how the symbol $(\alpha/A)_m$ behaves with respect to
automorphisms in the Galois group $G$ of $\mathbb{Q}(\zeta_m)/\mathbb{Q}$.

From now on we will use exponential notation for automorphisms.
If $\sigma \in G$ and $\alpha \in \mathbb{Q}(\zeta_m)$ we will write $\alpha^{\sigma}$ instead of $\sigma\alpha$. Similarly if $A$ is an ideal,
we will write $A^{\sigma}$ instead of $\sigma(A)$. This notation is, in fact, more conventional
and it has certain advantages.

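Proposition 14.2.4. Let $A$ be an ideal prime to $m$ and $\sigma \in G$. Then

$$ \left( \frac{\alpha}{A} \right)_m^{\sigma} = \left( \frac{\alpha^{\sigma}}{A^{\sigma}} \right)_m. $$

Proof. Since both sides of the asserted equality are multiplicative in $A$ it
will be enough to check the case where $A = P$ is a prime ideal. By definition

$$ \alpha^{(NP-1)/m} \equiv (\alpha/P)_m \ (P). $$

Applying $\sigma$ to this congruence we find

$$ (\alpha^{\sigma})^{(NP-1)/m} \equiv (\alpha/P)_m^{\sigma} \ (P^{\sigma}). $$

It follows that $(\alpha^{\sigma}/P^{\sigma})_m \equiv (\alpha/P)_m^{\sigma} \ (P^{\sigma})$ and so $(\alpha^{\sigma}/P^{\sigma})_m = (\alpha/P)_m^{\sigma}$. Note
that we have used $N(P^{\sigma}) = N(P)$. □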

We end this section by stating the Eisenstein reciprocity law. We need an 
important definition first. 

Let $l$ be an odd prime number. Recall that in $D_l$ we have $(l) = (1 - \zeta_l)^{l-1}$
and $(1 - \zeta_l)$ is a prime ideal of degree 1.

Definition. A nonzero element $\alpha \in D_l$ is called primary if it is not a unit and is
prime to $l$ and congruent to a rational integer modulo $(1 - \zeta_l)^2$.

In the case $l = 3$ we demanded $\alpha \equiv 2 \ ((1 - \zeta_3)^2)$ so the above definition
is a bit weaker in this case. It is, however, sufficient for our purposes. The
following lemma shows that primary elements are plentiful.

Lemma. Suppose $\alpha \in D_l$ and $\alpha$ is prime to $l$. There is an integer $c \in \mathbb{Z}$, unique
modulo $l$, such that $\zeta_l^{c}\alpha$ is primary.

Proof. Let $\lambda = 1 - \zeta_l$. Since the prime ideal $(\lambda)$ has degree 1 there is an
integer $a \in \mathbb{Z}$ such that $\alpha \equiv a \ (\lambda)$. Now, $(\alpha - a)/\lambda \in D_l$ so there is a $b \in \mathbb{Z}$
such that $(\alpha - a)/\lambda \equiv b \ (\lambda)$. Consequently, $\alpha \equiv a + b\lambda \ (\lambda^2)$.

Since $\zeta_l = 1 - \lambda$ we have $\zeta_l^{c} \equiv 1 - c\lambda \ (\lambda^2)$. It follows that

$$ \zeta_l^{c}\alpha \equiv a + (b - ac)\lambda \ (\lambda^2). $$

The integer $a$ is not divisible by $l$ since otherwise $\lambda \mid \alpha$ and we are assuming
$\alpha$ is prime to $l$. Choose $c$ to be a solution to $ax \equiv b \ (l)$. Then $\zeta_l^{c}\alpha \equiv a \ (\lambda^2)$
and so $\zeta_l^{c}\alpha$ is primary.

The uniqueness of $c$ modulo $l$ is clear from the proof. □
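
For $l = 3$ the lemma can be checked by hand or by machine. Writing $\omega = \zeta_3$, one has $(1 - \omega)^2 = -3\omega$, so a congruence modulo $(1 - \omega)^2$ is a congruence modulo 3, and $x + y\omega$ is congruent to a rational integer exactly when $3 \mid y$. The sketch below, with hypothetical helper names and elements stored as pairs `(x, y)`, searches for the exponent $c$ and confirms that it is unique. It tests only the congruence condition, not the "not a unit" clause of the definition.

```python
def times_omega(z):
    """Multiply x + y*omega by omega, using omega^2 = -1 - omega."""
    x, y = z
    return (-y, x - y)

def primary_exponent(z):
    """Return (c, omega^c * z) for the unique c in {0, 1, 2} making the omega-coefficient divisible by 3."""
    x, y = z
    # The norm x^2 - x*y + y^2 must be prime to 3.
    assert (x * x - x * y + y * y) % 3 != 0, "z must be prime to 3"
    hits = []
    w = z
    for c in range(3):
        if w[1] % 3 == 0:
            hits.append((c, w))
        w = times_omega(w)
    assert len(hits) == 1      # uniqueness of c modulo 3
    return hits[0]

print(primary_exponent((3, 1)))   # omega^2 * (3 + omega) = -2 - 3*omega
```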

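Theorem 1 (The Eisenstein Reciprocity Law). Let $l$ be an odd prime, $a \in \mathbb{Z}$
prime to $l$, and $\alpha \in D_l$ a primary element. Suppose moreover that $\alpha$ and $a$ are
prime to each other. Then

$$ \left( \frac{\alpha}{a} \right)_l = \left( \frac{a}{\alpha} \right)_l. $$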



The proof of this elegant theorem will be given in Section 5. It is a consequence
of the Stickelberger relation which will be stated in the next section
and proven in Section 4. Since this process is long, and somewhat involved,
the reader may wish to skip to the last part of the chapter, Section 6,
where three interesting applications of Eisenstein reciprocity are given.


§3 The Stickelberger Relation

From the very way they are defined Gauss sums are elements of cyclotomic
fields. We will investigate the prime ideal decomposition of Gauss sums
in these fields.

Let $F$ be a finite field with $p^f = q$ elements, $\chi$ a multiplicative character
of order $m$, and $\psi$ a nontrivial additive character. Then the values of $\chi$ are
$m$th roots of unity and the values of $\psi$ are $p$th roots of unity. Consequently,
$g(\chi, \psi) = \sum_{t \in F} \chi(t)\psi(t) \in \mathbb{Q}(\zeta_m, \zeta_p)$. The arithmetic of this field was dealt
with in the last chapter.

Before beginning it is necessary to normalize matters by specifying the
characters $\chi$ and $\psi$. This is done as follows.

Let $P$ be a prime ideal in $D_m \subset \mathbb{Q}(\zeta_m)$ and suppose $m \notin P$. Let $p\mathbb{Z} =
P \cap \mathbb{Z}$ and $N(P) = q = p^f$. Finally set $F = D_m/P$. Recall that $p^f \equiv 1 \ (m)$.

We define a multiplicative character $\chi_P$ on $F$ as follows. Let $0 \neq t \in F$
and let $\gamma \in D_m$ be such that $\bar{\gamma} = t$. Here $\bar{\gamma}$ is the residue class of $\gamma$ modulo $P$.
Let

$$ \chi_P(t) = \left( \frac{\gamma}{P} \right)_m^{-1}. $$

By Proposition 14.2.2, $\chi_P(t)$ is well defined and is a multiplicative character.
The reason for taking the inverse of the power residue symbol instead of the
symbol itself will become apparent later.

For the additive character we choose the character $\psi$ defined in Chapter 10,
Section 3. We recall the definition. First one defines $\mathrm{tr}: F \to \mathbb{Z}/p\mathbb{Z}$ by
$\mathrm{tr}(t) = t + t^{p} + t^{p^2} + \cdots + t^{p^{f-1}}$. Then $\psi$ is defined by $\psi(t) = \zeta_p^{\mathrm{tr}(t)}$.

With these choices we define $g(P) = g(\chi_P, \psi)$. We also define $\Phi(P) = g(P)^m$.

Proposition 14.3.1.

(a) $g(P) \in \mathbb{Q}(\zeta_m, \zeta_p)$.

(b) $|g(P)|^2 = q$.

(c) $\Phi(P) \in \mathbb{Q}(\zeta_m)$.

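Proof. (a) has already been discussed, (b) follows in the same way as when $F$
is the prime field, (c) follows from Proposition 8.3.3 which is stated over
$\mathbb{Z}/p\mathbb{Z}$ but generalizes easily to $F$.

We will give another proof of (c) based on Galois theory. Consider the
diagram of fields

    [Diagram: $\mathbb{Q}(\zeta_{mp}) = \mathbb{Q}(\zeta_m, \zeta_p)$ at the top, the subfields $\mathbb{Q}(\zeta_m)$ and $\mathbb{Q}(\zeta_p)$ below it, and $\mathbb{Q}$ at the bottom.]

The Galois group of $\mathbb{Q}(\zeta_{mp})/\mathbb{Q}$ is given by the automorphisms $\sigma_c$ where
$(c, pm) = 1$. We remark

(i) $\sigma_c$ leaves $\mathbb{Q}(\zeta_m)$ element-wise fixed iff $c \equiv 1 \ (m)$.

(ii) $\sigma_c$ leaves $\mathbb{Q}(\zeta_p)$ element-wise fixed iff $c \equiv 1 \ (p)$.

To show $\Phi(P) \in \mathbb{Q}(\zeta_m)$ it will suffice to show $\Phi(P)^{\sigma_c} = \Phi(P)$ whenever
$c \equiv 1 \ (m)$.

Apply $\sigma_c$ with $c \equiv 1 \ (m)$ to $g(P) = \sum_t \chi_P(t)\psi(t)$. Since $\chi_P(t)^{\sigma_c} = \chi_P(t)$ and
$\psi(t)^{\sigma_c} = \psi(t)^c = \psi(ct)$ we have

$$ g(P)^{\sigma_c} = \sum_t \chi_P(t)\psi(ct) = \chi_P(c)^{-1} g(P). $$

Raising both sides to the $m$th power shows that $\Phi(P)$ is invariant under $\sigma_c$
as asserted. □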
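
Parts (b) and (c) can be illustrated numerically in the simplest cubic case. The sketch below takes $F = \mathbb{Z}/7\mathbb{Z}$ and $m = 3$, builds a cubic character from the primitive root 3, and checks that $|g|^2 = 7$ and that $g^3 = 7\,J(\chi, \chi)$, where the Jacobi sum $J(\chi, \chi)$ visibly lies in $\mathbb{Q}(\zeta_3)$ (this is the case $n = 3$ of Proposition 8.3.3). The character used here is a cubic character of $F$, not necessarily the normalized $\chi_P$ of the text, and all names are illustrative assumptions.

```python
import cmath

p, m = 7, 3
gen = 3                                   # 3 is a primitive root modulo 7
omega = cmath.exp(2j * cmath.pi / m)

# Discrete logarithms to base gen, so that chi(gen^k) = omega^k has order 3.
log = {pow(gen, k, p): k for k in range(p - 1)}

def chi(t):
    """A cubic multiplicative character on Z/7Z, extended by chi(0) = 0."""
    return 0 if t % p == 0 else omega ** (log[t % p] % m)

def psi(t):
    """The additive character t -> exp(2*pi*i*t/p)."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)

g = sum(chi(t) * psi(t) for t in range(p))            # Gauss sum
jacobi = sum(chi(a) * chi(1 - a) for a in range(p))   # Jacobi sum J(chi, chi)

print(abs(g) ** 2)            # close to 7
print(g ** 3, p * jacobi)     # the two values agree up to rounding
```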

Before proceeding to discuss the prime decomposition of $g(P)$ and $\Phi(P)$
in the general case it is illuminating to review the situation when $m = 2$, 3,
and 4.

When $m = 2$, $\mathbb{Q}(\zeta_2) = \mathbb{Q}$. If $p$ is the positive generator of $P$ we have

$$ g(P)^2 = (-1)^{(p-1)/2} p. $$

When $m = 3$, $\mathbb{Q}(\zeta_3) = \mathbb{Q}(\sqrt{-3})$. Suppose $P$ has degree 1 and $P = (\pi)$
where $\pi$ is primary. From the results of Chapter 9, Section 4, we may deduce
$g(P)^3 = \Phi(P) = p\bar{\pi} = \pi\bar{\pi}^2$ (bar denotes complex conjugation).

For $m = 4$, $\mathbb{Q}(\zeta_4) = \mathbb{Q}(\sqrt{-1})$. Suppose $P$ is a prime ideal of degree 1
and $P = (\pi)$ where $\pi$ is primary. From Chapter 9, Section 7, we may deduce
$g(P)^4 = \Phi(P) = p\bar{\pi}^2 = \pi\bar{\pi}^3$ (again, bar denotes complex conjugation).
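
The case $m = 2$ is easy to confirm numerically. The following sketch evaluates the quadratic Gauss sum for a few small odd primes and compares its square with $(-1)^{(p-1)/2} p$. The function names are illustrative, and floating-point arithmetic is used, so agreement is only up to rounding.

```python
import cmath

def legendre(t, p):
    """Legendre symbol via Euler's criterion, with values in {-1, 0, 1}."""
    r = pow(t, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Quadratic Gauss sum: sum of (t/p) * exp(2*pi*i*t/p) over t mod p."""
    return sum(legendre(t, p) * cmath.exp(2j * cmath.pi * t / p) for t in range(p))

for p in [3, 5, 7, 11, 13]:
    g2 = gauss_sum(p) ** 2
    expected = (-1) ** ((p - 1) // 2) * p
    print(p, round(g2.real), expected)
```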

To see the pattern, and to state the generalization, a notational device
known as “symbolic powers” is very useful. Suppose $K/\mathbb{Q}$ is a number
field, Galois over $\mathbb{Q}$, with group $G$. The group ring $\mathbb{Z}[G]$ is defined as the
set of formal expressions $\sum_{\sigma \in G} a(\sigma)\sigma$ where the coefficients $a(\sigma) \in \mathbb{Z}$. Later,
we will show how to make this set into a ring. If $\alpha \in K$ we define

$$ \alpha^{\sum a(\sigma)\sigma} = \prod_{\sigma} \sigma(\alpha)^{a(\sigma)}. $$

If $A$ is an ideal we define its symbolic power by an element of the group
ring in the same way.

Let $\sigma$ be the nontrivial automorphism of $\mathbb{Q}(\sqrt{-3})/\mathbb{Q}$. Our result for
$m = 3$ takes the form $\Phi(P) = \pi^{1+2\sigma}$.

Similarly if $\tau$ denotes the nontrivial automorphism of $\mathbb{Q}(\sqrt{-1})/\mathbb{Q}$ our
result is $\Phi(P) = \pi^{1+3\tau}$.

In general we cannot expect a factorization of $\Phi(P)$ into irreducible
elements since $D_m$ is not always a unique factorization domain. However,
these special cases generalize beautifully as follows.

Theorem 2 (The Stickelberger Relation). Let $P$ be a prime ideal in $D_m$ not
containing $m$. Then

$$ (\Phi(P)) = P^{\sum t\,\sigma_t^{-1}}. $$

The sum is over all $1 \le t < m$ which are relatively prime to $m$.

The proof of Theorem 2 is long. It will occupy the next section entirely. 
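
To see concretely what the exponent in Theorem 2 looks like, note that $\sigma_t^{-1} = \sigma_{t'}$ where $t t' \equiv 1 \ (m)$. The sketch below lists, for a given $m$, the pairs $(t, t^{-1} \bmod m)$, so that the exponent is the sum of $t\,\sigma_{t'}$ over the listed pairs $(t, t')$. The function name is an illustrative assumption, and the modular inverse via `pow(t, -1, m)` assumes Python 3.8 or later.

```python
from math import gcd

def stickelberger_exponent(m):
    """Pairs (t, t^{-1} mod m) for 1 <= t < m with gcd(t, m) = 1."""
    return [(t, pow(t, -1, m)) for t in range(1, m) if gcd(t, m) == 1]

for m in [3, 4, 5, 7]:
    print(m, stickelberger_exponent(m))
```

For $m = 3$ and $m = 4$ the output reproduces the exponents $1 + 2\sigma$ and $1 + 3\tau$ found above; for $m = 5$ the exponent is $\sigma_1 + 2\sigma_3 + 3\sigma_2 + 4\sigma_4$.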


§4 The Proof of the Stickelberger Relation 

We begin with three elementary results which will be needed later. 

Lemma 1. Let $p > 1$ be a positive integer. Every positive integer can be
written uniquely in the form $\sum_{i=0}^{n} a_i p^i$ where $0 \le a_i < p$.

Proof. Let $a$ be a positive integer. There is a unique nonnegative integer $n$
such that $p^n \le a < p^{n+1}$. By the division algorithm we have $a = a_n p^n + r$
where $0 \le r < p^n$. The number $a_n$ is less than $p$ since otherwise $a \ge p^{n+1}$.
Apply the same process to $r$, etc. In finitely many steps we have an expression
for $a$ of the required form.

The uniqueness can be shown as follows. Suppose $\sum a_i p^i = \sum b_i p^i$ where
$0 \le a_i, b_i < p$. Then $p$ divides $a_0 - b_0$. Since $|a_0 - b_0| < p$ we have $a_0 = b_0$.
Subtract $a_0$ from both sides, divide by $p$, and repeat the reasoning. This
yields $a_1 = b_1$. In finitely many steps we see $a_i = b_i$ for all $i$. □

Definition. Let $q = p^f$. If $0 \le a < q - 1$ write $a = \sum_{i=0}^{f-1} a_i p^i$ with $0 \le a_i < p$
and define $S(a) = \sum_{i=0}^{f-1} a_i$. For an arbitrary positive integer $a$ define $S(a) =
S(r)$ where $a \equiv r \ (q - 1)$ and $0 \le r < q - 1$.

Definition. For a real number $u$ define $\langle u \rangle$ as $u - [u]$ where $[u]$ is the largest
integer less than or equal to $u$. The number $\langle u \rangle$, which is in the interval
$[0, 1)$, is called the fractional part of $u$.

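Lemma 2. $S(a) = (p - 1) \sum_{i=0}^{f-1} \langle p^i a/(q - 1) \rangle$.

Proof. Both sides are unchanged if a multiple of $q - 1$ is added to $a$. Thus
we may assume $0 \le a < q - 1$.

Write $a = a_0 + a_1 p + \cdots + a_{f-1} p^{f-1}$ where $0 \le a_i < p$. Since $p^f =
q \equiv 1 \ (q - 1)$ we have

$$ a = a_0 + a_1 p + \cdots + a_{f-1} p^{f-1}, $$
$$ pa \equiv a_{f-1} + a_0 p + \cdots + a_{f-2} p^{f-1} \ (q - 1), $$
$$ p^2 a \equiv a_{f-2} + a_{f-1} p + \cdots + a_{f-3} p^{f-1} \ (q - 1), \ \text{etc.} $$

The right-hand sides of these congruences are all less than $q - 1$ so that
$\langle p^i a/(q - 1) \rangle$ is equal to the right-hand side of the $i$th congruence divided
by $q - 1$. Thus

$$ \sum_{i=0}^{f-1} \left\langle \frac{p^i a}{q - 1} \right\rangle = \frac{S(a)(1 + p + \cdots + p^{f-1})}{q - 1}. $$

Since $1 + p + \cdots + p^{f-1} = (q - 1)/(p - 1)$, this yields the lemma. □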
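
Lemma 2 is easy to test with exact rational arithmetic. The sketch below computes $S(a)$ from the base-$p$ digits of $a$ reduced modulo $q - 1$ and compares it with $(p - 1)\sum_i \langle p^i a/(q-1) \rangle$. The function names and the sample parameters $p = 3$, $f = 2$ are assumptions chosen for illustration.

```python
from fractions import Fraction

def digit_sum(a, p, f):
    """S(a) for q = p**f: reduce a modulo q - 1, then add the base-p digits."""
    r = a % (p ** f - 1)
    total = 0
    while r:
        r, d = divmod(r, p)
        total += d
    return total

def frac(u):
    """Fractional part <u> of a Fraction u."""
    return u - (u.numerator // u.denominator)

def lemma2_rhs(a, p, f):
    """Right-hand side of Lemma 2, computed exactly."""
    q = p ** f
    return (p - 1) * sum(frac(Fraction(p ** i * a, q - 1)) for i in range(f))

p, f = 3, 2
for a in range(1, 20):
    assert digit_sum(a, p, f) == lemma2_rhs(a, p, f)
print("Lemma 2 checked for p = 3, f = 2")
```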

Lemma 3. $\sum_{a=1}^{q-2} S(a) = f(p - 1)(q - 2)/2$.

Proof. Write $a = a_0 + a_1 p + \cdots + a_{f-1} p^{f-1}$ with $0 \le a_i < p$. Notice
that $q - 1 = (p - 1) + (p - 1)p + \cdots + (p - 1)p^{f-1}$. It follows that
$q - 1 - a = (p - 1 - a_0) + (p - 1 - a_1)p + \cdots + (p - 1 - a_{f-1})p^{f-1}$
and so

$$ S(a) + S(q - 1 - a) = f(p - 1). $$

Sum both sides from $a = 1$ to $a = q - 2$. The result is $2\sum_{a=1}^{q-2} S(a) =
f(p - 1)(q - 2)$. □
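
The identity of Lemma 3 can be confirmed directly for small $q$. The sketch below sums the base-$p$ digit sums over $1 \le a \le q - 2$ and prints the result next to $f(p-1)(q-2)/2$. The parameter choices are arbitrary illustrations.

```python
def digit_sum(a, p):
    """Sum of the base-p digits of a nonnegative integer a."""
    total = 0
    while a:
        a, d = divmod(a, p)
        total += d
    return total

for p, f in [(2, 3), (3, 2), (5, 2), (7, 1)]:
    q = p ** f
    lhs = sum(digit_sum(a, p) for a in range(1, q - 1))
    rhs = f * (p - 1) * (q - 2) // 2
    print(p, f, lhs, rhs)
```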

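The Gauss sum $g(P)$ considered in the last section is an element of
$\mathbb{Q}(\zeta_m, \zeta_p)$. The proof of Theorem 2 which we will give requires that we work
in the bigger field $\mathbb{Q}(\zeta_{q-1}, \zeta_p)$. This has the advantage that all the $(q - 1)$st
roots of unity can be used freely. On the other hand, more fields means more
confusion. We will try to minimize the confusion by carefully keeping
track of which field we are working in.

The following diagram will be useful in following the arguments. Each row
lies above the row beneath it.

    $\mathcal{P} \subset D_{(q-1)p} \subset \mathbb{Q}(\zeta_{q-1}, \zeta_p)$
    $\mathfrak{P} \subset D_{q-1} \subset \mathbb{Q}(\zeta_{q-1})$, with residue field $D_{q-1}/\mathfrak{P}$
    $P \subset D_m \subset \mathbb{Q}(\zeta_m)$, with residue field $D_m/P$
    $(p) \subset \mathbb{Z} \subset \mathbb{Q}$, with residue field $\mathbb{Z}/p\mathbb{Z}$

In the above diagram $P$, $\mathfrak{P}$, and $\mathcal{P}$ are prime ideals in the indicated ring
of integers. Recall from Section 3 that $p \nmid m$, $f$ is the order of $p$ modulo $m$,
so that $p^f \equiv 1 \ (m)$, and $q = p^f$. For the remainder of this section $\lambda_p = 1 - \zeta_p$.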

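Lemma 4.

(1) $\mathrm{ord}_{\mathcal{P}}(pD_{(q-1)p}) = p - 1$.

(2) $\mathrm{ord}_{\mathcal{P}}(\lambda_p) = 1$.

(3) $\mathrm{ord}_{\mathcal{P}}(P) = p - 1$.

Proof. To prove (1) apply Proposition 13.2.9 with $m$ (in the notation of
that proposition) replaced by $q - 1$. Since $\mathcal{P}$ lies over $p$ it appears in the
decomposition of $pD_{(q-1)p}$ and one has $\mathrm{ord}_{\mathcal{P}}\, pD_{(q-1)p} = p - 1$. Again by the
same proposition and Proposition 13.2.7 one has $pD_{p(q-1)} = (pD_p)D_{p(q-1)} =
(\lambda_p)^{p-1} D_{p(q-1)} = (\mathcal{P}\mathcal{P}_2 \cdots \mathcal{P}_h)^{p-1}$, where, say, $\mathcal{P} = \mathcal{P}_1$. Hence $(\lambda_p) =
\mathcal{P}\mathcal{P}_2 \cdots \mathcal{P}_h$ and (2) follows. To prove (3) one sees easily using Theorem 2
of Chapter 13 and Proposition 13.2.9 that $P P_2 \cdots P_g D_{(q-1)p} = (\mathcal{P}\mathcal{P}_2 \cdots
\mathcal{P}_h)^{p-1}$ where all the primes are distinct and $P, P_2, \ldots, P_g$ (the primes of
$D_m$ containing $p$) are pairwise relatively prime. Thus $\mathcal{P}^{p-1}$ is the exact power
of $\mathcal{P}$ dividing $PD_{(q-1)p}$ and the result follows. □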

Lemma 5. $D_m/P \cong D_{q-1}/\mathfrak{P}$.

Proof. There is a natural monomorphism from $D_m/P$ to $D_{q-1}/\mathfrak{P}$. To show
this is an isomorphism it suffices to show both fields have the same number
of elements. By Theorem 2 of Chapter 13 we have $|D_{q-1}/\mathfrak{P}| = p^{f'}$ where $f'$
is the smallest positive integer such that $p^{f'} \equiv 1 \ (q - 1)$. Since $q = p^f$ it
is clear that $f' = f$ and so $|D_{q-1}/\mathfrak{P}| = p^f = |D_m/P|$. □

By Proposition 13.2.3 we know that the elements $1, \zeta_{q-1}, \ldots, \zeta_{q-1}^{q-2}$ have
distinct images in $D_{q-1}/\mathfrak{P}$. The following definition imitates the definition
of the $m$th power residue symbol.

Definition. For $\alpha \in D_{q-1}$ define

(a) $(\alpha/\mathfrak{P}) = 0$ if $\alpha \in \mathfrak{P}$.

(b) If $\alpha \notin \mathfrak{P}$, $(\alpha/\mathfrak{P})$ is the unique $(q - 1)$st root of unity such that
$\alpha \equiv (\alpha/\mathfrak{P}) \ (\mathfrak{P})$.

One easily checks that $(\alpha\beta/\mathfrak{P}) = (\alpha/\mathfrak{P})(\beta/\mathfrak{P})$ and $\alpha \equiv \beta \ (\mathfrak{P})$ implies
$(\alpha/\mathfrak{P}) = (\beta/\mathfrak{P})$. The following lemma is also clear from the definitions.

Lemma 6. If $\alpha \in D_m$, $(\alpha/\mathfrak{P})^{(q-1)/m} = (\alpha/P)_m$.


We now define a multiplicative character on $F = D_{q-1}/\mathfrak{P}$ as follows

$$ \omega(t) = \left( \frac{\gamma}{\mathfrak{P}} \right), $$

where $\gamma \in D_{q-1}$ is such that $\bar{\gamma} = t$. The proof that $\omega$ is well defined and is a
multiplicative character is immediate from the previous remarks.

Lemma 7. $\omega(\bar{\zeta}_{q-1}) = \zeta_{q-1}$.

Proof. Immediate from the definition. □


Consequently, co has order q — 1 and thus generates the group of multi¬ 
plicative characters on F. 

Definition. Let $a$ be a nonnegative integer. Define $g_a = g(\omega^{-a}, \psi)$.

We note that $g(P)$, defined in the last section, is equal to $g_a$ for
$a = (q - 1)/m$.

Theorem 2 is a consequence of the following result.

Theorem 3. $\mathrm{ord}_{\mathcal{P}}(g_a) = S(a)$, where $1 \le a < q - 1$.

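Proof. To begin with we show that $\mathrm{ord}_{\mathcal{P}}(g_1) = 1$. Recall

$$ g_1 = \sum_{t \in F} \omega^{-1}(t)\psi(t). $$

Using Lemma 7 we will convert this into a sum over the powers of $\zeta_{q-1}$.
Let $m_i$ be a positive integer such that $m_i \equiv \mathrm{tr}(\bar{\zeta}_{q-1}^{\,i}) \ (p)$. Also recall that
$\zeta_p = 1 - \lambda_p$. Then

$$ g_1 = \sum_{i=0}^{q-2} \zeta_{q-1}^{-i} (1 - \lambda_p)^{m_i}. $$

Using the binomial theorem we see $(1 - \lambda_p)^{m_i} \equiv 1 - m_i\lambda_p \ (\mathcal{P}^2)$ and so

$$ g_1 \equiv -\left( \sum_{i=0}^{q-2} \zeta_{q-1}^{-i} m_i \right) \lambda_p \ (\mathcal{P}^2). $$

Now, $m_i\lambda_p \equiv (\zeta_{q-1}^{i} + \zeta_{q-1}^{pi} + \cdots + \zeta_{q-1}^{p^{f-1}i})\lambda_p \ (\mathcal{P}^2)$. Substituting we find

$$ g_1 \equiv -\sum_{i=0}^{q-2} \zeta_{q-1}^{-i} \left( \zeta_{q-1}^{i} + \zeta_{q-1}^{pi} + \cdots + \zeta_{q-1}^{p^{f-1}i} \right) \lambda_p \ (\mathcal{P}^2). $$

All the sums $\sum_{i=0}^{q-2} \zeta_{q-1}^{(p^j - 1)i}$, $j = 1, 2, \ldots, f - 1$, are zero while $j = 0$ gives
the value $q - 1$. Since $q = p^f \equiv 0 \ (\mathcal{P}^2)$ we have

$$ g_1 \equiv \lambda_p \ (\mathcal{P}^2). $$

By Lemma 4, part (2), we see $\lambda_p \in \mathcal{P}$ but $\lambda_p \notin \mathcal{P}^2$. Thus $\mathrm{ord}_{\mathcal{P}}\, g_1 = 1$.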


Let $s(a) = \mathrm{ord}_{\mathcal{P}}\, g_a$. We will establish a number of properties of the
function $s(a)$.

(i) $s(a + b) \le s(a) + s(b)$ provided $1 \le a, b, a + b < q - 1$.

By Theorem 1 of Chapter 8 we have $g_a g_b = J(\omega^{-a}, \omega^{-b})\, g_{a+b}$. Taking
$\mathrm{ord}_{\mathcal{P}}$ of both sides yields the result.

(ii) $s(a + b) \equiv s(a) + s(b) \ (p - 1)$.

Notice that the Jacobi sum $J(\omega^{-a}, \omega^{-b})$ is in $D_{q-1}$. It then follows
from the fact that $\mathfrak{P}D_{(q-1)p} = \mathcal{P}^{p-1}$ that $p - 1$ divides $\mathrm{ord}_{\mathcal{P}}(J(\omega^{-a}, \omega^{-b}))$.
The result is thus again an immediate consequence of the relation $g_a g_b =
J(\omega^{-a}, \omega^{-b})\, g_{a+b}$.

(iii) $s(pa) = s(a)$.

To see this observe $g_{pa} = \sum_t \omega^{-pa}(t)\psi(t) = \sum_t \omega^{-a}(t^p)\psi(t^p)$. We have
used the fact that $\mathrm{tr}(t^p) = \mathrm{tr}(t)$ which is clear from the definition of trace.
Now $t \to t^p$ is an automorphism of $F$. We conclude that $g_{pa} = g_a$ and so
$s(pa) = s(a)$.

In the first part of the proof we found $s(1) = 1$. Using (i) and (ii) we see
$s(a) = a$ for $1 \le a < p$.

For any $a$ between 1 and $q - 1$ write $a = a_0 + a_1 p + \cdots + a_{f-1} p^{f-1}$,
$0 \le a_i < p$. Using (i) and (iii) we find

$$ s(a) \le \sum_{j=0}^{f-1} s(a_j p^j) = \sum_j s(a_j) = \sum_j a_j = S(a). $$

We now have $s(a) \le S(a)$ for all $a$ in the range under consideration. To
prove the theorem it will be enough, in the light of Lemma 3, to show

(iv) $\sum_{a=1}^{q-2} s(a) = \dfrac{f(p - 1)(q - 2)}{2}$.

In general, for Gauss sums, we have the relation $g(\bar{\chi}) = \chi(-1)\overline{g(\chi)}$
(here “bar” denotes complex conjugation). Thus $g_a g_{q-1-a} = \omega(-1)^a q =
\omega(-1)^a p^f$. We know by Lemma 4 that $\mathrm{ord}_{\mathcal{P}}(p) = p - 1$. It follows that

$$ s(a) + s(q - 1 - a) = f(p - 1). $$

Sum both sides over $a$ from 1 to $q - 2$. The result is $2\sum_{a=1}^{q-2} s(a) =
f(p - 1)(q - 2)$.

This completes the proof of Theorem 3. □





Corollary. $\mathrm{ord}_P(\Phi(P)) = (m/(p - 1))\,S((q - 1)/m)$.

Proof. Using Lemma 4, part (3), we have $(p - 1)\,\mathrm{ord}_P(\Phi(P)) = \mathrm{ord}_{\mathcal{P}}(\Phi(P))$.
Now, $\mathrm{ord}_{\mathcal{P}}(\Phi(P)) = m\,\mathrm{ord}_{\mathcal{P}}(g(P)) = mS((q - 1)/m)$ where the last equality
follows from the theorem because $g(P) = g_a$ with $a = (q - 1)/m$. □


This corollary gives the first step in deriving the full prime decomposition
of $\Phi(P)$. To go further we first notice that the only prime ideals in $D_m$
containing $\Phi(P)$ are those containing $p$. This follows from parts (b) and (c) of
Proposition 14.3.1 which show

$$ |\Phi(P)|^2 = q^m = p^{fm}. $$

If $P'$ is another prime ideal of $D_m$ containing $p$ then by Proposition 12.3.3
there is an automorphism $\sigma_t$ of $\mathbb{Q}(\zeta_m)/\mathbb{Q}$ such that $P' = P^{\sigma_t^{-1}}$. For $1 \le t < m$
and $(t, m) = 1$ define $P_t = P^{\sigma_t^{-1}}$.


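Lemma 8. $\mathrm{ord}_{P_t}(\Phi(P)) = (m/(p - 1))\,S(t(q - 1)/m)$.

Proof. It follows quickly from the definitions that

$$ \mathrm{ord}_{P_t}(\Phi(P)) = \mathrm{ord}_P(\Phi(P)^{\sigma_t}). $$

Choose an integer $t'$ such that $t' \equiv t \ (m)$ and $t' \equiv 1 \ (p)$. Then

$$ g(P)^{\sigma_{t'}} = \left( \sum_{\tau \in F} \chi_P(\tau)\psi(\tau) \right)^{\sigma_{t'}} = \sum_{\tau \in F} \chi_P(\tau)^t \psi(\tau). $$

Thus, we have

$$ \Phi(P)^{\sigma_t} = \left( \sum_{\tau \in F} \chi_P(\tau)^t \psi(\tau) \right)^m. $$

The second term in the above equality is $g_a^m$ where $a = t((q - 1)/m)$.
The proof of the lemma is now concluded by the same reasoning as in the
corollary to Theorem 3. □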



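We may now, finally, conclude the proof of Theorem 2.
By the corollary to Theorem 2 of Chapter 13 the group

$$ G(P) = \{\sigma \in G(\mathbb{Q}(\zeta_m)/\mathbb{Q}) \mid P^{\sigma} = P\} $$

is the cyclic group generated by $\sigma_p$.

Let $t_1, t_2, \ldots, t_g$ be a set of integers representing the cosets of $U(\mathbb{Z}/m\mathbb{Z})$
modulo the cyclic subgroup generated by the image of $p$. In other words,
if $1 \le t < m$, $(t, m) = 1$ then $t \equiv t_i p^j \ (m)$ for a unique pair $(i, j)$, $0 \le j < f$,
$1 \le i \le g$. By Lemma 8 the prime decomposition of $\Phi(P)$ is given by

$$ (\Phi(P)) = P^{\gamma'}, \quad \text{where } \gamma' = \frac{m}{p - 1} \sum_{i=1}^{g} S\!\left( t_i \frac{q - 1}{m} \right) \sigma_{t_i}^{-1}. $$

Using Lemma 2 we can write $\gamma'$ as follows

$$ \gamma' = m \sum_{i} \sum_{j} \left\langle \frac{p^j t_i}{m} \right\rangle \sigma_{t_i}^{-1}. $$

The index $i$ goes from 1 to $g$ and the index $j$ goes from 0 to $f - 1$. Since $\sigma_p$
leaves $P$ fixed, $\gamma'$ has the same effect on $P$ as

$$ m \sum_{i} \sum_{j} \left\langle \frac{p^j t_i}{m} \right\rangle \sigma_{p^j t_i}^{-1} = m \sum_{t} \left\langle \frac{t}{m} \right\rangle \sigma_t^{-1} = \sum_{t} t\,\sigma_t^{-1}, \quad \text{where } 1 \le t < m \text{ and } (t, m) = 1. $$

This concludes the proof. □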

For future reference we note

$$ (\Phi(P)) = P^{m\theta} $$

where $\theta = \sum \langle t/m \rangle \sigma_t^{-1}$, the sum being over $t$ mod $m$ with $(t, m) = 1$. The
element $\theta \in \mathbb{Q}[G]$ is called the Stickelberger element.


§5 The Proof of the Eisenstein Reciprocity Law

We will need two results on roots of unity.

Lemma 1. The only roots of unity in $\mathbb{Q}(\zeta_m)$ are $\pm\zeta_m^{i}$, $i = 1, 2, \ldots, m$.

Proof. In the proof of Theorem 1 we only need this result when $m$ is an odd
prime. We will leave the proof for general $m$ as an exercise and assume
$m = l$, an odd prime.

Suppose $\zeta_n \in \mathbb{Q}(\zeta_l)$. If $4 \mid n$ then $\sqrt{-1} \in \mathbb{Q}(\zeta_l)$. However, 2 is ramified in
$\mathbb{Q}(\sqrt{-1})$ and is not ramified in $\mathbb{Q}(\zeta_l)$. Thus $4 \nmid n$. If $n = 2n_0$, $n_0$ odd, then
$\{\zeta_n^{i}\} = \{\pm\zeta_{n_0}^{i}\}$, so we may assume that $n$ is odd. If $l'$ is an odd prime dividing
$n$ then $\zeta_{l'} \in \mathbb{Q}(\zeta_l)$. However, $l'$ is ramified in $\mathbb{Q}(\zeta_{l'})$ and $l$ is the only prime
ramified in $\mathbb{Q}(\zeta_l)$. Thus $l = l'$ and $n$ must be a power of $l$, $l^a$ say. Since $\phi(l^a) =
l^{a-1}(l - 1)$ is the dimension of $\mathbb{Q}(\zeta_{l^a})$ over $\mathbb{Q}$ and $l - 1$ is the dimension
of $\mathbb{Q}(\zeta_l)$ over $\mathbb{Q}$ we must have $a = 1$. The result follows from this. □

Lemma 2. Let $K/\mathbb{Q}$ be an algebraic number field and let $\sigma_1, \sigma_2, \ldots, \sigma_n$ be the
$n = [K : \mathbb{Q}]$ isomorphisms of $K$ into $\mathbb{C}$. If $\alpha \in K$ is a nonzero algebraic integer
such that $|\alpha^{\sigma_i}| \le 1$ for all $i = 1, 2, \ldots, n$ then $\alpha$ is a root of unity.

Proof. $\alpha$ is a root of

$$ f(x) = \prod_{i=1}^{n} (x - \alpha^{\sigma_i}) \in \mathbb{Z}[x]. $$

The hypothesis of the lemma implies that the coefficient of $x^m$ in $f(x)$ is
an integer bounded by the binomial coefficient $\binom{n}{m}$. Thus only finitely many
polynomials of degree $n$ in $\mathbb{Z}[x]$ can arise in this way.

If $\alpha$ satisfies the hypothesis of the lemma so do all the powers of $\alpha$. Since
finitely many polynomials can have only finitely many roots it follows that
two distinct powers of $\alpha$ must be equal. Thus $\alpha$ is a root of unity. □

The next step is to define $\Phi(A)$ for an arbitrary ideal of $D_m$, $A$ prime to $m$,
and to investigate the properties of this function. In particular, it will be
important to determine $\Phi$ on principal ideals.

Definition. Let $A \subset D_m$ be an ideal and assume $A$ is prime to $m$. Let $A =
P_1 P_2 \cdots P_n$ be the prime decomposition of $A$. Define

$$ \Phi(A) = \Phi(P_1)\Phi(P_2) \cdots \Phi(P_n). $$

Proposition 14.5.1. Let $A, B \subset D_m$ be ideals prime to $m$, $\alpha \in D_m$ an element
prime to $m$, and recall $\gamma = \sum t\,\sigma_t^{-1}$, $1 \le t < m$ and $(t, m) = 1$. Then

(a) $\Phi(A)\Phi(B) = \Phi(AB)$.

(b) $|\Phi(A)|^2 = (NA)^m$.

(c) $(\Phi(A)) = A^{\gamma}$.

(d) $\Phi((\alpha)) = \varepsilon(\alpha)\alpha^{\gamma}$ where $\varepsilon(\alpha)$ is a unit in $D_m$.

Proof. (a) is clear from the definition.

Since both sides of (b) are multiplicative in $A$ we can assume $A$ is a prime
ideal $P$. In that case $|\Phi(P)|^2 = |g(P)|^{2m} = (NP)^m$ by Proposition 14.3.1,
part (b).

Both sides of (c) are multiplicative in $A$ so again we may suppose $A$ is a
prime ideal $P$. In this case the result is the assertion of Theorem 2.

To do part (d) notice

$$ (\Phi((\alpha))) = (\alpha)^{\gamma} = (\alpha^{\gamma}) $$

by part (c). Thus $\Phi((\alpha))$ and $\alpha^{\gamma}$ generate the same principal ideal. □

From now on we will write $\Phi(\alpha)$ instead of $\Phi((\alpha))$.

It will be important to determine the unit $\varepsilon(\alpha)$ more closely. In fact we
will show it is a root of unity.

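Lemma 3. Suppose $A \subset D_m$ is an ideal prime to $m$ and let $\sigma$ be an automorphism
of $\mathbb{Q}(\zeta_m)/\mathbb{Q}$. Then

$$ \Phi(A)^{\sigma} = \Phi(A^{\sigma}). $$

Proof. To see this it is convenient to write $g(P)$ in the following form

$$ g(P) = \sum_{\alpha} \left( \frac{\alpha}{P} \right)_m^{-1} \zeta_p^{\mathrm{tr}(\bar{\alpha})}, $$

where the sum is over a set of representatives for the cosets of $D_m/P$.

Let $\tilde{\sigma}$ be an automorphism of $\mathbb{Q}(\zeta_m, \zeta_p)/\mathbb{Q}$ which restricts to $\sigma$ on $\mathbb{Q}(\zeta_m)$
and the identity on $\mathbb{Q}(\zeta_p)$ (see the proof to Lemma 8). By Proposition
14.2.4 we have

$$ g(P)^{\tilde{\sigma}} = \sum_{\alpha} \left( \frac{\alpha^{\sigma}}{P^{\sigma}} \right)_m^{-1} \zeta_p^{\mathrm{tr}(\bar{\alpha})}. $$

Since $\mathrm{tr}(\bar{\alpha}) \in \mathbb{Z}/p\mathbb{Z}$ we have $\mathrm{tr}(\overline{\alpha^{\sigma}}) = \mathrm{tr}(\bar{\alpha})$. It follows that $g(P)^{\tilde{\sigma}} = g(P^{\sigma})$.
Raising both sides to the $m$th power gives the result when $A$ is a prime ideal.
By multiplicativity the result follows in general. □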

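Lemma 4. For $\alpha \in D_m$, $|\alpha^{\gamma}|^2 = |N\alpha|^m$.

Proof. The automorphism $\sigma_{-1}$ is complex conjugation on $\mathbb{Q}(\zeta_m)$ since it
takes $\zeta_m$ to $\zeta_m^{-1} = \bar{\zeta}_m$. Thus

$$ |\alpha^{\gamma}|^2 = \alpha^{\gamma}\alpha^{\sigma_{-1}\gamma} = \alpha^{(1 + \sigma_{-1})\gamma}. $$

Now, $\sigma_{-1}\gamma = \sigma_{-1}\sum t\,\sigma_t^{-1} = \sum t\,\sigma_{-t}^{-1}$. Clearly, $\sigma_{m-t} = \sigma_{-t}$, and $\gamma =
\sum (m - t)\,\sigma_{m-t}^{-1}$. Thus, using $t = m - (m - t)$ we find

$$ (1 + \sigma_{-1})\gamma = m \sum \sigma_t^{-1}. $$

Since $N\alpha = \alpha^{\sum \sigma_t} = \alpha^{\sum \sigma_t^{-1}}$ the result follows. □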

Proposition 14.5.2. Let $\alpha \in D_m$, $\alpha$ prime to $m$. Then $\Phi(\alpha) = \varepsilon(\alpha)\alpha^{\gamma}$ where
$\varepsilon(\alpha) = \pm\zeta_m^{i}$ for some $i$.

Proof. In the light of part (d) of the last proposition it is enough to prove the
assertion about $\varepsilon(\alpha)$. We have $|\Phi(\alpha)|^2 = (N((\alpha)))^m$ by Proposition 14.5.1 and
$|\alpha^{\gamma}|^2 = |N\alpha|^m$ by Lemma 4. By Proposition 14.1.3, $N((\alpha)) = |N\alpha|$.

Putting all this together we conclude that $|\varepsilon(\alpha)| = 1$. Using Lemma 3
we find in the same way that $|\varepsilon(\alpha)^{\sigma}| = 1$ for all $\sigma \in G$. It now follows from
Lemma 2 that $\varepsilon(\alpha)$ is a root of unity. Finally since $\varepsilon(\alpha) \in \mathbb{Q}(\zeta_m)$ we have
$\varepsilon(\alpha) = \pm\zeta_m^{i}$ by Lemma 1. □

We are now in a position to begin the proof of the Eisenstein reciprocity 
law. The pattern of proof of the following proposition should be familiar 
from our proofs of quadratic, cubic, and biquadratic reciprocity. It is itself 
a “reciprocity” statement. 

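Proposition 14.5.3. Suppose $P, P' \subset D_m$ are prime ideals both prime to $m$.
Suppose further that $NP$ and $NP'$ are relatively prime. Then

$$ \left( \frac{\Phi(P)}{P'} \right)_m = \left( \frac{NP'}{P} \right)_m. $$

Proof. Let $q' = p'^{f'} = NP'$. Recall $q' \equiv 1 \ (m)$. The following congruences
are taken modulo $p'$ in $D_m$

$$ g(P)^{q'} \equiv \sum_t \chi_P(t)^{q'} \psi(t)^{q'} $$
$$ \equiv \sum_t \chi_P(t)\psi(q't) $$
$$ \equiv \left( \frac{NP'}{P} \right)_m g(P). $$

On the other hand

$$ g(P)^{q'-1} = \Phi(P)^{(q'-1)/m} \equiv \left( \frac{\Phi(P)}{P'} \right)_m \ (P'). $$

It follows that

$$ \left( \frac{\Phi(P)}{P'} \right)_m \equiv \left( \frac{NP'}{P} \right)_m \ (P'). $$

Since $m \notin P'$ the two sides of this congruence must be equal. □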


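Corollary 1. Suppose $A, B \subset D_m$ are ideals prime to $m$ and that $NA$ and $NB$
are prime to each other. Then

$$ \left( \frac{\Phi(A)}{B} \right)_m = \left( \frac{NB}{A} \right)_m. $$

Proof. As usual, the corollary follows from the proposition by
multiplicativity. □

Corollary 2. Suppose $A$ and $B$ are as in Corollary 1 and moreover that $A = (\alpha)$
is principal. Then

$$ \left( \frac{\varepsilon(\alpha)}{B} \right)_m \left( \frac{\alpha}{NB} \right)_m = \left( \frac{NB}{\alpha} \right)_m. $$

Proof. To begin with

$$ \left( \frac{\Phi(\alpha)}{B} \right)_m = \left( \frac{\varepsilon(\alpha)}{B} \right)_m \left( \frac{\alpha^{\gamma}}{B} \right)_m. $$

Notice that $(\alpha^{t\sigma_t^{-1}}/B)_m = (\alpha^{\sigma_t^{-1}}/B)_m^{t} = (\alpha^{\sigma_t^{-1}}/B)_m^{\sigma_t} = (\alpha/B^{\sigma_t})_m$ by
Proposition 14.2.4. Thus

$$ \left( \frac{\alpha^{\gamma}}{B} \right)_m = \prod_t \left( \frac{\alpha}{B^{\sigma_t}} \right)_m = \left( \frac{\alpha}{\prod_t B^{\sigma_t}} \right)_m = \left( \frac{\alpha}{NB} \right)_m. $$

To obtain the final equality we have used Proposition 14.1.2. □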

From now on we will assume $m = l$, an odd prime number.

Lemma 5. If $A \subset D_l$ is an ideal prime to $l$, then $\Phi(A) \equiv \pm 1 \ (l)$.

Proof. It is enough to show that $\Phi(P) \equiv -1 \ (l)$ where $P \subset D_l$ is a prime
ideal prime to $l$. Well,

$$ \Phi(P) = g(P)^l \equiv \sum_t \chi_P(t)^l \psi(t)^l \ (l) $$
$$ \equiv \sum_{t \neq 0} \psi(lt) \equiv -1 \ (l). $$

The last congruence follows from the fact that $\psi$ is a nontrivial
additive character on $D_l/P$ and so the sum of its values over all $t$ is zero.
Since $\psi(0) = 1$ the result follows. □

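Recall that $\alpha \in D_l$ is called primary if $\alpha$ is prime to $l$ and $\alpha \equiv x \ ((1 - \zeta_l)^2)$,
for some $x \in \mathbb{Z}$.

Lemma 6. If $\alpha \in D_l$ is primary, then $\varepsilon(\alpha) = \pm 1$.

Proof. Since $(1 - \zeta_l)$ is the unique prime above $l$ in $D_l$ we have $(1 - \zeta_l)^{\sigma} =
(1 - \zeta_l)$ for all $\sigma \in G$. It follows that $\alpha^{\sigma} \equiv x \ ((1 - \zeta_l)^2)$.

Since $\Phi(\alpha) = \varepsilon(\alpha)\alpha^{\gamma}$ we have by Lemma 5 that $\varepsilon(\alpha)\alpha^{\gamma} \equiv \pm 1 \ (l)$.

Since $\alpha \equiv x \ ((1 - \zeta_l)^2)$ with $x \in \mathbb{Z}$ we find

$$ \alpha^{\gamma} \equiv x^{\gamma} = x^{1 + 2 + \cdots + (l-1)} = (x^{(l-1)/2})^{l} \ ((1 - \zeta_l)^2). $$

Now, $x^{(l-1)/2} \equiv \pm 1 \ (l)$, so

$$ \alpha^{\gamma} \equiv (\pm 1)^l \equiv \pm 1 \ ((1 - \zeta_l)^2). $$

It follows that $\varepsilon(\alpha) \equiv \pm 1 \ ((1 - \zeta_l)^2)$. From Proposition 14.5.2 we know
$\varepsilon(\alpha) = \pm\zeta_l^{i}$. To conclude the proof we must show that $l$ divides $i$. This
follows from the uniqueness part of the lemma in Section 2, but it may be
worthwhile to do it directly.

We have $\zeta_l^{i} \equiv \pm 1 \ ((1 - \zeta_l)^2)$. Writing $\zeta_l = 1 - (1 - \zeta_l)$ we find

$$ 1 - i(1 - \zeta_l) \equiv \pm 1 \ ((1 - \zeta_l)^2). $$

The plus sign must hold since otherwise $1 - \zeta_l$ would divide 2. But then,
subtracting 1 from both sides, we see $1 - \zeta_l$ divides $i$ which implies $l \mid i$. □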

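Proposition 14.5.4. If $\alpha \in D_l$ is primary, and $B$ is an ideal prime to $l$, and $NB$
is prime to $\alpha$, then

$$ \left( \frac{\alpha}{NB} \right)_l = \left( \frac{NB}{\alpha} \right)_l. $$

Proof. By Corollary 2 to Proposition 14.5.3 we need only show $(\varepsilon(\alpha)/B)_l = 1$.

Since $\alpha$ is primary $\varepsilon(\alpha) = \pm 1$ by the above lemma. Since $l$ is odd, $(\pm 1)^l =
\pm 1$ and we are done. □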


We can now complete the proof of Theorem 1.

Let $p \in \mathbb{Z}$ be a prime, $p \neq l$, and $p$ prime to $\alpha$ in $D_l$. Let $P$ be a prime ideal
in $D_l$ containing $p$. Then $NP = p^f$. In the proposition we have just proven
we substitute $P$ for $B$. The result is

$$ \left( \frac{\alpha}{p} \right)_l^{f} = \left( \frac{p}{\alpha} \right)_l^{f}. $$

Since $f \mid l - 1 = [\mathbb{Q}(\zeta_l) : \mathbb{Q}]$ we have $(f, l) = 1$. Thus

$$ \left( \frac{\alpha}{p} \right)_l = \left( \frac{p}{\alpha} \right)_l. $$

From this and (one last time) multiplicativity, we deduce $(\alpha/a)_l = (a/\alpha)_l$
for all $a \in \mathbb{Z}$ prime to $l$ and $\alpha$, provided $\alpha$ is primary. □


§6 Three Applications 

In Chapter 5 we proved that if $a$ is an integer such that $x^2 \equiv a \ (p)$ is solvable
for all but finitely many primes then $a$ is a square. This has been generalized
to $n$th powers by E. Trost. The result was later rediscovered by N. C. Ankeny
and C. A. Rogers. The result states that if $x^n \equiv a \ (p)$ is solvable for all but
finitely many primes $p$ then $a = b^n$ if $8 \nmid n$ and $a = b^n$ or $a = 2^{n/2} b^n$ if $8 \mid n$.
Using Eisenstein reciprocity we will prove a portion of this when $n = l$ an odd
prime. See also [211], [134] and the Notes to Chapter 5.

Theorem 4. Suppose $a \in \mathbb{Z}$ and that $l \nmid a$ where $l$ is an odd prime. If $x^l \equiv a \ (p)$
is solvable for all but finitely many primes $p$ then $a = b^l$.

Proof. We can restate the theorem as follows. If $a$ is not an $l$th power then
there are infinitely many primes $p$ such that $x^l \equiv a \ (p)$ is not solvable.

Assume $a$ is not an $l$th power in $\mathbb{Z}$. Let $aD_l = P_1^{a_1} P_2^{a_2} \cdots P_n^{a_n}$ be the prime
decomposition of $a$ in $D_l$. We claim that $l \nmid a_i$ for at least one $i$. To see
this, let $p_i\mathbb{Z} = P_i \cap \mathbb{Z}$. Since $l \nmid a$ we have $l \neq p_i$ and so $p_i$ is unramified in
$D_l$. Consequently $\mathrm{ord}_{p_i} a = \mathrm{ord}_{P_i} a = a_i$. If $l \mid a_i$ for all $i$ it would follow that
$a$ is an $l$th power in $\mathbb{Z}$. We may thus assume $l \nmid a_n$.

Let $\{Q_1, Q_2, \ldots, Q_k\}$ be a finite set of primes $Q_i$ different from the $P_j$ and
from $(1 - \zeta_l)$.

Using the Chinese Remainder Theorem we can find an element $\tau \in D_l$
such that $\tau \equiv 1 \ (Q_i)$ for $i = 1, 2, \ldots, k$, $\tau \equiv 1 \ (l)$, $\tau \equiv 1 \ (P_j)$ for $j = 1, 2, \ldots,
n - 1$, and $\tau \equiv \alpha \ (P_n)$ where $\alpha$ is chosen so that $(\alpha/P_n)_l = \zeta_l$.

Since $\tau \equiv 1 \ (l)$, $\tau$ is primary. Thus, on the one hand

$$ \left( \frac{a}{\tau} \right)_l = \left( \frac{\tau}{a} \right)_l = \prod_{j=1}^{n} \left( \frac{\tau}{P_j} \right)_l^{a_j} = \zeta_l^{a_n} \neq 1. $$

On the other hand, let $(\tau) = R_1 R_2 \cdots R_m$ be the prime decomposition
of $\tau$. Then

$$ \left( \frac{a}{\tau} \right)_l = \prod_{j=1}^{m} \left( \frac{a}{R_j} \right)_l. $$

It follows that for some $j$, $(a/R_j)_l \neq 1$.

From the congruences that $\tau$ satisfies it follows immediately that $R_j \notin
\{Q_1, Q_2, \ldots, Q_k\} \cup \{(1 - \zeta_l)\} \cup \{P_1, \ldots, P_n\}$.

We have shown that there are infinitely many prime ideals $Q$ such that
$x^l \equiv a \ (Q)$ is not solvable. Let $q\mathbb{Z} = Q \cap \mathbb{Z}$. Then $x^l \equiv a \ (q)$ is not solvable
and there are infinitely many such $q$ since every rational prime is contained
in only finitely many prime ideals in $D_l$. □
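
Theorem 4 can be illustrated experimentally. Taking $a = 2$ and $l = 3$ (2 is not a cube in $\mathbb{Z}$), the sketch below lists the primes $p < 200$ for which $x^3 \equiv 2 \ (p)$ has no solution. The bound 200 and the helper names are arbitrary choices made for illustration; the computation shows only that such primes keep appearing, not that there are infinitely many.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_lth_power_mod(a, l, p):
    """Brute-force test: is x^l = a (mod p) solvable?"""
    return any(pow(x, l, p) == a % p for x in range(p))

a, l = 2, 3
bad = [p for p in range(2, 200) if is_prime(p) and not is_lth_power_mod(a, l, p)]
print(bad)
```

Every prime in the list is congruent to 1 modulo 3, since cubing is a bijection on $\mathbb{Z}/p\mathbb{Z}$ when $3 \nmid p - 1$.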

The second application of Eisenstein reciprocity we wish to make is to
Fermat’s conjecture. This states that if $n > 2$ is an integer there is no solution
to $x^n + y^n + z^n = 0$ in non-zero integers. The fascinating history of this
conjecture will be sketched in a later chapter.

It is easy to see that if Fermat’s conjecture is true for n then it will be true 
for any multiple of n. Since any integer bigger than 2 is either divisible by 4 
or by an odd prime we may restrict our attention to the cases n = 4 or 
n = / an odd prime. The case n = 4 was settled, affirmatively, by L. Euler. 

When / is an odd prime it is traditional to consider two cases. We say 
we are in case one if x l y l -{■ z l — 0 and / 木 xyz. Otherwise we are in case 
two. In 1909 A. Wieferich published the following important result ([166], 
Vol. 3). 

Theorem 5. If x l -{■ y l z l = 0 is solvable in non-zero integers such that 
l xyz then 2 卜 1 三 1 (l 2 ). 

It has been shown that the only two primes less than $3 \times 10^9$ which
satisfy $2^{l-1} \equiv 1 \ (l^2)$ are 1093 and 3511. It is not known if there are infinitely
many primes of this type.
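The Wieferich condition is easy to test by machine. The following is a minimal sketch, not part of the original text; the function names are hypothetical and the search bound of 4000 is chosen only so that the two known examples appear.

```python
def primes_below(bound):
    """Sieve of Eratosthenes: all primes strictly below bound (bound >= 2)."""
    sieve = [True] * bound
    sieve[0] = sieve[1] = False
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, bound, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def satisfies_wieferich(l):
    """True if 2^(l-1) is congruent to 1 modulo l^2."""
    return pow(2, l - 1, l * l) == 1


def wieferich_primes_below(bound):
    """Odd primes l < bound with 2^(l-1) = 1 (mod l^2)."""
    return [l for l in primes_below(bound) if l > 2 and satisfies_wieferich(l)]


print(wieferich_primes_below(4000))
```

Modular exponentiation via the three-argument `pow` keeps every intermediate value below $l^2$, so the same sketch can be pushed to much larger bounds with a faster sieve.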

In 1912 Furtwängler proved a theorem which contains Theorem 5 as a
corollary. Namely,

Theorem 6. Let $x$, $y$, and $z$ be non-zero integers, relatively prime in pairs, such
that $x^l + y^l + z^l = 0$. Assume $l \nmid yz$. Let $p$ be a prime factor of $y$. Then
$p^{l-1} \equiv 1 \ (l^2)$.

It is a simple exercise to see that the condition that $x$, $y$, and $z$ be relatively
prime in pairs is no loss of generality.

To see how Theorem 5 follows from Theorem 6, assume $l \nmid xyz$. Since
$x^l + y^l + z^l = 0$ not all three numbers $x$, $y$, and $z$ can be odd. By symmetry
we can assume $2 \mid y$. By Theorem 6 we have $2^{l-1} \equiv 1 \ (l^2)$.

We proceed to prove Furtwängler’s theorem. Let $\zeta = \zeta_l$ be a primitive
$l$th root of unity. We have

$$(x + y)(x + \zeta y) \cdots (x + \zeta^{l-1} y) = (-z)^l. \qquad (*)$$

Lemma 1. Suppose $i \ne j$ and $0 \le i, j < l$. Then $x + \zeta^i y$ and $x + \zeta^j y$ are
relatively prime in $D_l$.

Proof. Suppose $A \subset D_l$ is an ideal containing $x + \zeta^i y$ and $x + \zeta^j y$. Then
$(\zeta^j - \zeta^i)x$ and $(\zeta^i - \zeta^j)y$ are in $A$. Since $x$ and $y$ are relatively prime it follows
that $\zeta^i - \zeta^j$ is in $A$. It follows that $\lambda = 1 - \zeta \in A$. Since $(\lambda)$ is a maximal
ideal, either $(\lambda) = A$ or $A = D_l$. If $(\lambda) = A$, then from equation (*) we see
$(-z)^l \in (\lambda)$ which implies $z \in (\lambda)$ and $l \mid z$, contrary to assumption. Thus
$A = D_l$ and we are done. □

Corollary. The ideals $(x + \zeta^i y)$ are perfect $l$th powers.

Consider the element $\alpha = (x + y)^{l-2}(x + \zeta y)$. We claim

(i) The ideal $(\alpha)$ is a perfect $l$th power.

(ii) $\alpha \equiv 1 - u\lambda \ (\lambda^2)$ where $u = (x + y)^{l-2} y$.

Property (i) follows from the corollary to the lemma.

To prove property (ii) notice $x + \zeta y = x + y - y\lambda$. Thus,

$$\alpha = (x + y)^{l-1} - \lambda u.$$

Now, $x^l + y^l + z^l \equiv x + y + z \ (l)$. If $l \mid (x + y)$ it would follow that $l \mid z$
contrary to assumption. Therefore $l \nmid (x + y)$ and $(x + y)^{l-1} \equiv 1 \ (l)$.
Property (ii) follows.

Consider $\zeta^{-u}\alpha$. We have

$$\zeta^{-u}\alpha = (1 - \lambda)^{-u}\alpha \equiv (1 + u\lambda)(1 - u\lambda) \equiv 1 \ (\lambda^2).$$

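It follows that $\zeta^{-u}\alpha$ is primary. By Eisenstein reciprocity we have

$$\left(\frac{p}{\zeta^{-u}\alpha}\right)_l = \left(\frac{\zeta^{-u}\alpha}{p}\right)_l = \left(\frac{\zeta}{p}\right)_l^{-u} \left(\frac{\alpha}{p}\right)_l. \qquad (**)$$

Since the ideal $(\zeta^{-u}\alpha) = (\alpha)$ is an $l$th power, the left-hand side of (**) is
equal to 1.

Since $p \mid y$, $\alpha \equiv (x + y)^{l-1} \ (p)$. Thus

$$\left(\frac{\alpha}{p}\right)_l = \left(\frac{(x + y)^{l-1}}{p}\right)_l = \left(\frac{p}{(x + y)^{l-1}}\right)_l = 1$$

because the ideal $(x + y)$ is an $l$th power.

It now follows from (**) that $(\zeta/p)_l^{u} = 1$. To conclude the proof we must
evaluate $(\zeta/p)_l$.

Let $pD_l = P_1 P_2 \cdots P_g$ be the prime decomposition of $p$ in $D_l$. We know
$NP_i = p^f$ and, since $p \ne l$, $e = 1$, and so $gf = l - 1$.
By the corollary to Proposition 14.2.2

$$\left(\frac{\zeta}{p}\right)_l = \prod_{i=1}^{g} \left(\frac{\zeta}{P_i}\right)_l = \prod_{i=1}^{g} \zeta^{(p^f - 1)/l} = \zeta^{g(p^f - 1)/l}.$$

The relation $(\zeta/p)_l^{u} = 1$ now leads to the congruence

$$u g \, \frac{p^f - 1}{l} \equiv 0 \ (l).$$

Since $g \mid l - 1$, $l \nmid g$. Since $u = (x + y)^{l-2} y$, $l \nmid u$. Thus

$$\frac{p^f - 1}{l} \equiv 0 \ (l) \quad \text{or} \quad p^f \equiv 1 \ (l^2).$$

The theorem is now immediate since $f \mid l - 1$. □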


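We conclude with an application of Theorem 2 which concerns the structure
of the ideal class group of $\mathbb{Q}(\sqrt{-l})$ where $l > 3$ is a prime, $l \equiv 3 \ (4)$. Let
$p$ be an odd prime, $p \equiv 1 \ (l)$. Then since $p$ splits completely in $\mathbb{Q}(\zeta_l)$ it also
splits in $\mathbb{Q}(\sqrt{-l})$ (why?). One can also see this by observing

$$\left(\frac{-l}{p}\right) = (-1)^{(p-1)/2} \left(\frac{p}{l}\right) (-1)^{((p-1)/2)((l-1)/2)} = \left(\frac{p}{l}\right) = 1$$

and applying Proposition 13.1.3. In the ring of integers $D$ of $\mathbb{Q}(\sqrt{-l})$ write
$p = \mathfrak{p}\bar{\mathfrak{p}}$. If $\tilde{D}$ denotes the ring of integers in $\mathbb{Q}(\zeta_l)$ we have

Lemma 2. $\mathfrak{p}\tilde{D} = \prod_s \sigma_s P$ where $P$ is a prime ideal of $\tilde{D}$, $P \cap D = \mathfrak{p}$ and $s$ runs
over the nonzero squares modulo $l$.

Proof. The set of $\sigma_s$ in the statement of the lemma form the Galois group of
$\mathbb{Q}(\zeta_l)$ over $\mathbb{Q}(\sqrt{-l})$. Since $p\tilde{D} = \prod_{t=1}^{l-1} \sigma_t(P)$ and $\sigma_n(\mathfrak{p}) = \bar{\mathfrak{p}}$ for a nonsquare $n$
modulo $l$ it follows that $\mathfrak{p}\tilde{D}$ is divisible by precisely the $\sigma_s(P)$, each with
exponent 1. □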

By Theorem 2 we have $(g(P)^l) = P^{\sum t\sigma_t^{-1}}$, $t = 1, 2, \ldots, l - 1$. Applying
$\sum \sigma_s$, $s$ a square modulo $l$, gives $(\alpha)\tilde{D} = \mathfrak{p}^{\sum s}\, \bar{\mathfrak{p}}^{\sum n}\, \tilde{D}$ where
$\alpha \in D$ and $n$ runs over the nonsquares modulo $l$ in the interval $[1, l - 1]$.
Put $R = \sum s$, $N = \sum n$. By Exercise 34 of Chapter 12 it follows that $\alpha D =
\mathfrak{p}^R \bar{\mathfrak{p}}^N$. If $[\mathfrak{A}]$ denotes the equivalence class of the ideal $\mathfrak{A}$ and 1 is the unit
class then $[\mathfrak{p}]^{-1} = [\bar{\mathfrak{p}}]$. Thus $[\mathfrak{p}]^{N-R} = 1$. On the other hand if $1 < r <
l - 1$ by Exercise 6 (or Lemma 3, Section 3, Chapter 15), one has $g(P)^{\sigma_r - r} = \beta$
for some $\beta \in \tilde{D}$. Raising to the $l$th power, using Theorem 2, and applying
$\sum \sigma_s$ gives, for $r$ a square $\ne 1$, $(\mathfrak{p}^R \bar{\mathfrak{p}}^N)^{1-r} = (\gamma)^l D$ for some $\gamma \in D$. It
follows that $([\mathfrak{p}]^{(N-R)/l})^{r-1} = 1$ (it is easy to show $l \mid N$ and $l \mid R$). But from
the above $([\mathfrak{p}]^{(N-R)/l})^{l} = 1$. Since $(r - 1, l) = 1$ we have proven the
following result.

Proposition 14.6.1. Let $\mathfrak{p}$ be a prime ideal of degree 1 in $\mathbb{Q}(\sqrt{-l})$ for $l > 3$
a prime such that $l \equiv 3 \ (4)$. Then, $[\mathfrak{p}]^{(N-R)/l} = 1$.

While it is elementary that $(N - R)/l$ is an integer it is by no means
obvious that it is positive. All known proofs of this fact use analysis. We
will give a short proof due to Moser in the Exercises to Chapter 16. For
other proofs of the positivity as well as many other interesting results of
this type see the paper by B. Berndt [94]. It turns out, as mentioned in Chapter
13, that $(N - R)/l$ is indeed the class number of $\mathbb{Q}(\sqrt{-l})$ but again the
proof is analytic. When, by direct calculation, $N - R = l$ it follows that $\mathfrak{p}$
is a principal ideal. If one assumes, as can be shown, that each ideal class
contains a prime ideal of degree 1 then one can conclude that for such $l$,
$\mathbb{Q}(\sqrt{-l})$ is a unique factorization domain. In this manner one checks that
the imaginary quadratic fields with discriminant $-7$, $-11$, $-19$, $-43$,
$-67$, $-163$ all have class number 1. Referring again to the proposition,
$\mathfrak{p}^{(N-R)/l} = (\alpha)$ where $\alpha = (x + y\sqrt{-l})/2$, $x, y \in \mathbb{Z}$. Taking norms gives the
following interesting corollary.

Corollary. If $p \equiv 1 \ (l)$, $l \equiv 3 \ (4)$, $l > 3$, then $4p^{(N-R)/l} = x^2 + ly^2$, with $x, y \in \mathbb{Z}$.
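The quantities in the proposition and its corollary are easy to compute directly. The sketch below is an illustration only: the function names are hypothetical, and the search for $x, y$ is a plain brute-force loop rather than anything taken from the text. It computes $R$, $N$ and $(N - R)/l$, and looks for a representation $4p^{(N-R)/l} = x^2 + ly^2$.

```python
from math import isqrt


def residue_sums(l):
    """Return (R, N): sums of the quadratic residues and of the nonresidues in [1, l-1]."""
    squares = {(x * x) % l for x in range(1, l)}
    R = sum(squares)
    N = l * (l - 1) // 2 - R
    return R, N


def class_exponent(l):
    """(N - R)/l for a prime l > 3 with l = 3 (mod 4)."""
    R, N = residue_sums(l)
    assert (N - R) % l == 0
    return (N - R) // l


def represent(p, l):
    """Brute-force search for x, y >= 0 with 4 * p**h = x^2 + l*y^2, h = (N - R)/l."""
    target = 4 * p ** class_exponent(l)
    y = 0
    while l * y * y <= target:
        rest = target - l * y * y
        x = isqrt(rest)
        if x * x == rest:
            return x, y
        y += 1
    return None


for l in (7, 11, 19, 23, 43, 67, 163):
    print(l, class_exponent(l))
print(represent(29, 7))   # 29 = 1 (mod 7)
```

For the six discriminants listed in the text the exponent comes out as 1, in agreement with the statement that these fields have class number 1.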


Notes 

In his paper “Uber eine Verallgemeinerung der Kreistheilung” (1890) [224], 
the Swiss mathematician Ludwig Stickelberger (1850-1936) (see [148]) 
succeeded in determining the prime decomposition of a Gauss sum attached 
to an arbitrary multiplicative character defined on a finite field (Theorem 2 
of this chapter). Actually he proved a more precise result. Namely, using the
notation of this chapter

$$g_a \equiv -\frac{(-\lambda_p)^{S(a)}}{a_0!\, a_1! \cdots a_{f-1}!} \quad (\mathfrak{P}^{S(a)+1}).$$

This, of course, implies Theorem 2. The special case of this theorem when $m$
is prime and $p \equiv 1 \ (m)$ had already been proven by Kummer in 1847. It
is interesting to note that Kummer derived the result by first determining
the decomposition in $\mathbb{Q}(\zeta_m)$ of certain Jacobi sums, which in turn was made
possible by the congruence $J(\omega^{-m}, \omega^{-n}) \equiv -[(m + n)!/n!\, m!] \ (\mathfrak{P})$, known to
Jacobi, Eisenstein and Cauchy. (See Kummer [164], Vol. 1, pp. 361-364, 
pp. 448-453, and Exercises 1 and 2). An elegant proof of Kummer’s result 
can also be found in Hilbert’s “Zahlbericht” [151] (Theorem 135), where 
the use of Jacobi sums is avoided by using an argument involving ramifi¬ 
cation. This special case of Stickelberger’s theorem was the missing link in 
the program initiated by Gauss, Eisenstein and Jacobi to establish higher 
power reciprocity laws. Indeed, in 1850 [132] Eisenstein published his 
proof of the reciprocity law bearing his name (Theorem 1), making use of the 
then relatively new language of ideal numbers due to Kummer. A complete 
proof can be found also in Vol. 3 of Landau [166] as well as in Hilbert’s
“Zahlbericht” (Theorem 140), where in order to overcome the restriction
that $p \equiv 1 \ (l)$ he uses the finiteness of the class number for $\mathbb{Q}(\zeta_l)$! Hilbert
views the Eisenstein law as an indispensable lemma for the Kummer
reciprocity law. The proof of Theorem 2 that we have given follows that found
in the important paper by Hasse and Davenport [23] (see also Chapter 7
of Joly [160]), while the derivation of Eisenstein’s law from Kummer’s
Theorem closely follows the treatment in Weil’s elegant historical study
“La cyclotomie jadis et naguère” [238]. This paper of Weil along with his
review of Eisenstein’s “Mathematische Werke” [239] and his introduction
to the collected papers of Kummer [164] provide a detailed and insightful
history of the efforts of Jacobi, Eisenstein, and Kummer to prove higher 
power reciprocity laws with the use of Gauss sums. In this text we have 
followed this development up to the work of Eisenstein. The subsequent 
development leads to the research of Kummer, Hilbert, Furtwangler, and 
Takagi, and eventually, to the celebrated Artin law of reciprocity. For the 
history of these developments see Iyanaga [158] and Hasse [110]. For an 
interesting, and perhaps more elementary, discussion of the nature of 
reciprocity laws see Wyman’s paper “What is a reciprocity law?” [246]. 

The use of Theorem 2 to show that the ideal class group of $\mathbb{Q}(\sqrt{-l})$,
$l \equiv 3 \ (4)$, is annihilated by $(1/l)\sum_{x=1}^{l-1} \left(\frac{x}{l}\right) x$ goes back to Kummer and
appears as Theorem 145 of Hilbert’s “Zahlbericht”. The corollary to
Proposition 14.6.1 was originally observed by Jacobi who, on its basis,
conjectured the class number formula for $\mathbb{Q}(\sqrt{-l})$. (See also the comment
of Weil [238], pp. 252-253.) Stickelberger, in the above-mentioned paper,
returns to this application of cyclotomy to the arithmetic of quadratic
forms and obtains similar results for $\mathbb{Q}(\sqrt{-m})$, for general $m$.

There are other applications of Theorem 1 to Fermat’s Last Theorem.
For example, a well-known result of Mirimanoff states that if $x$, $y$, and $z$
are integers such that $x^p + y^p + z^p = 0$, $p \nmid xyz$ then $3^{p-1} \equiv 1 \ (p^2)$ (see
Theorem 1041, Landau [166]). Also Vandiver has shown, using similar
methods, that if $x^p + y^p + z^p = 0$, $(x, y, z) = 1$, $p > 3$ then $x^p \equiv x \ (p^3)$,
$y^p \equiv y \ (p^3)$, $z^p \equiv z \ (p^3)$ (Landau [166], Theorem 1046). For further results
on Fermat’s Last Theorem that utilize Eisenstein reciprocity see Lecture 9
of the beautiful book by P. Ribenboim, 13 Lectures on Fermat's Last Theorem
[206].


Exercises 

Throughout these exercises the notation is as in this chapter. 

1. Show that if $1 < n < q - 1$, $1 < m < q - 1$ then

(a) $J(\omega^{-n}, \omega^{-m}) \equiv -[(m + n)!/n!\, m!] \ (\mathfrak{P})$.

(b) If $1 < a < q - 1$, $a = a_0 + a_1 p + \cdots + a_{f-1} p^{f-1}$ then $J(\omega^{-1}, \omega^{-(a-1)}) \equiv
-a_0 \ (\mathfrak{P})$.

2. In the proof to Theorem 2 we showed that $g_1 \equiv \lambda_p \ (\mathfrak{P}^2)$.

(a) If $1 < a < p - 1$ show $g_a \equiv (-1)^{a+1} \lambda_p^{a}/a! \ (\mathfrak{P}^{a+1})$ where $\alpha \equiv \beta \ (\mathfrak{P}^n)$ means
$\operatorname{ord}_{\mathfrak{P}}(\alpha - \beta) \ge n$.

(b) If the Stickelberger congruence $g_a \equiv -(-\lambda_p)^{S(a)}/(a_0!\, a_1! \cdots a_{f-1}!) \ (\mathfrak{P}^{1+S(a)})$
holds for some $1 < a < q - 1$ and $pa < q - 1$ then show it also holds for $g_{pa}$.

(c) Establish the general Stickelberger congruence.

3. Show that if $m > 2$ then $g(P)p^{-1/2}$ is not an algebraic integer (see also Chowla
[113]).

4. Let $r$ and $s$ be positive integers, $m \nmid r + s$. Show that $(J(\chi_P^r, \chi_P^s)) = P^{\alpha}$ where $\alpha =
\sum [\langle rt/m \rangle + \langle st/m \rangle - \langle (r + s)t/m \rangle]\sigma_t^{-1}$, the sum being over $t$, $1 \le t < m$,
$(t, m) = 1$.

5. Check that the argument in Section 4 showing $g_1 \equiv \lambda_p \ (\mathfrak{P}^2)$ is valid for $p = 2$,
$m$ odd.

6. If $(r, pm) = 1$, $1 < r < pm$, then $g(P)^{\sigma_r - r} \in \mathbb{Q}(\zeta_m)$.

7. Verify Lemma 1 of Section 5. 

8. Let $p \equiv 1 \ (m)$, where $m$ is prime. Without using Exercise 4 show that $J(\chi, \chi^k)$,
$1 < k < p - 2$, is a product of distinct prime ideals each with exponent 1. Use
Exercise 1 to determine the decomposition of $J(\chi, \chi^k)$ and use Proposition 8.3.3
to give a direct proof, in this case, of Theorem 2 (Kummer).

9. Let $K/F$ be a Galois extension with cyclic Galois group of order $p$ and generator $\sigma$.
Define, for $x \in K$, $f(x) = 1 + x + x\sigma(x) + \cdots + x\sigma(x) \cdots \sigma^{p-3}(x)$. Let $p$ be prime,
$F = \mathbb{Q}(\zeta_{p-1})$, $K = \mathbb{Q}(\zeta_p, \zeta_{p-1})$. Show that $g(\chi) = \sum \chi(x)\zeta_p^{x} = \zeta_p f(\zeta_{p-1}\zeta_p^{t-1})$
where $t$ is a primitive root modulo $p$, $\chi(t) = \zeta_{p-1}$ and $\sigma$ is the automorphism of $K/F$
for which $\sigma(\zeta_{p-1}) = \zeta_{p-1}$ and $\sigma(\zeta_p) = \zeta_p^{t}$. Conclude that the Gauss sum is the great
grandfather of cohomology theory (Kummer [164], p. 10).

10. Use Theorem 2 to show that Q(g(P) m ) is the fixed field of the decomposition group 
of p, also known as the decomposition field of p. 

11. For a prime $l$ and positive integers $r$, $s$, and $t$ satisfying $r + s + t = l$ put $H_{r,s,t} =
\{h \mid h \in F_l^{*},\ \langle hr \rangle + \langle hs \rangle + \langle ht \rangle = l\}$ where $\langle a \rangle$ denotes the smallest nonnegative residue of $a$
modulo $l$. Show that $H_{r,s,t}$ is a set of coset representatives for the subgroup of order
2 in $F_l^{*}$.

12. Consider the curve $T$ over $F_p$ defined by $y^l = x^r(1 - x)^s$, the notation being as in
Exercise 11.

(a) Show that the zeta function of $T$ can be written $z(u) = g(u)/(1 - u)(1 - pu)$,
where $g(u) = \prod_P (1 + J(\chi_P^r, \chi_P^s)u^f)$, where $P$ ranges over the prime ideals in $\mathbb{Q}(\zeta_l)$
over $p$ and the notation is as in the text; i.e., $\chi_P$ is the $l$th power residue symbol.

(b) Show that (J/p, Xp)) = P^\ where k g H r s 

(c) Show that if the order of $p$ modulo $l$ (i.e., $f$) is even, then complex conjugation is in
the decomposition group of $P$.

(d) If $f$ is even, $(J(\chi_P^r, \chi_P^s)) = (p^{f/2})$.

(e) $J(\chi_P^r, \chi_P^s) = u p^{f/2}$, where $u$ is an $l$th root of unity.

(f) Show that $u = 1$.

(Exercises 11 and 12 are from a paper by B. H. Gross and D. E. Rohrlich [142].) 

13. Let / be prime, / ^ £ a multiplicative character of F t . Put B y = (1//) Y!a~=\ a X( a )' 
Consider the elements of the group ring of the Galois group G of Q(C/)/Q with 
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coefficients in Q(C Z ) defined by s x = (1/( / - l))Xi:\ x{a)~ l a a , 0 = 
where cr fl (C/) = Cf- Show 

(a) f. x (l,i)8 x - i(g) = /( —1)(/ — l) -2 /. 

(b) — / = (1 — C/)(C/ + 2C/ 2 + ••• + (/ — 1)C! J )- 

(c) 〜 (-C,A1 - c,)) = 

(d) 0e x = —B x -ie x , where one defines (X!=i ^t^t) = Yj=i c t G t with 

C t = ^juv = t(l) a uK^ 1 ^ w, i; < L 

(This exercise is taken from Iwasawa [157], pp. 115-117.) 

14. Let $p$ and $l$ be prime, $l \ge 3$. If $p \ne l$ and $\alpha \in \mathbb{Z}[\zeta_l]$ is real, $(\alpha, l) = 1$, show that $(\alpha/p)_l = 1$.

15. Let $p \ne l$ be primes, $l \ge 3$. Show

(a) $(\zeta_l/p)_l = \zeta_l^{((l-1)/f)((p^f - 1)/l)}$, where $f$ is the order of $p$ modulo $l$.

(b) $(\zeta_l/p)_l = 1$ implies $p^{l-1} \equiv 1 \ (l^2)$.

16. Read Satz 1039 and Satz 1041 in Landau [166], Vol. 3. 

17. Let m = /, an odd prime, and let if be a prime ideal in Q(f p/ ) containing (1 — C/)- 
Show 

(a) g(P)^ -1(1 -C/). 

(b) g(P) = - 1 + c(l - C/) (J^ 2 ) with c g Z[f p ]. 

(c) (-g(P)) at = (-g(P)y (^ 2 ) for (t, l) = 1 and a, the automorphism of Q(C p i) 
such that (T t (C p ) = C P and cr f (C/) = Ci- 

(d) 分 ( 尸广 — iy m). 

(e) If 1 <«,/?< /, l%a + b then J(x a p, Xp) = — 1 (1 — C/) 2 . 

This exercise is taken from Iwasawa [156]. 



Chapter 15 

Bernoulli Numbers 


In this chapter we will introduce an important sequence of
rational numbers discovered by Jacob Bernoulli (1654-1705)
and discussed by him in a posthumous work Ars
Conjectandi (1713). These numbers, now called Bernoulli
numbers, appear in many different areas of mathematics.
In the first section we give their definition and discuss their
connection with three different classical problems. In the
next section we discuss various arithmetical properties of
Bernoulli numbers including the Clausen-von Staudt
theorem and the Kummer congruences. The first of these
results determines the denominators of the Bernoulli
numbers, and the second gives information about their
numerators. In the last section we prove a theorem due to
J. Herbrand which relates Bernoulli numbers to the
structure of the ideal class group of $\mathbb{Q}(\zeta_p)$. The material
in this section is somewhat sophisticated but we have
included it anyway because it provides a beautiful and
important application of the Stickelberger relation which
was proven in the last chapter.


§1 Bernoulli Numbers; Definitions and Applications 


We begin by discussing three problems, each of historic interest. 

The first concerns finding formulas for summing the kth powers of the 
first n integers. Jacob Bernoulli was aware of the following facts 


$$1 + 2 + 3 + \cdots + (n - 1) = \frac{n(n - 1)}{2},$$

$$1^2 + 2^2 + 3^2 + \cdots + (n - 1)^2 = \frac{n(n - 1)(2n - 1)}{6},$$

$$1^3 + 2^3 + 3^3 + \cdots + (n - 1)^3 = \frac{n^2(n - 1)^2}{4},$$

as well as corresponding, less well known, formulas for exponents up to 10.
For each exponent $k$ the sum $1^k + 2^k + \cdots + (n - 1)^k$ turned out to be a
polynomial in $n$ of degree $k + 1$. In his efforts to determine the coefficients
of these polynomials for general $k$, Bernoulli was led to define the numbers
which bear his name. He was completely successful in answering the original
problem and proudly remarks (in his book Ars Conjectandi) that in less
than a half of a quarter of an hour he was able to sum the tenth powers of
the first thousand integers [220].

Another outstanding problem of that period was to evaluate the sum

$$\zeta(2) = \sum_{n=1}^{\infty} \frac{1}{n^2}$$

and more generally $\zeta(2m)$, where $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$ is the Riemann zeta
function. After long effort L. Euler showed in 1734 that $\zeta(2) = \pi^2/6$.
Subsequently he determined $\zeta(2m)$ for all positive integers $m$.

The third problem is the celebrated Fermat’s Last Theorem. If $n$ is an
integer greater than 2, Fermat asserted that $x^n + y^n = z^n$ has no solution
in positive integers. This assertion has never been proved in general. It is
easily seen that the conjecture is true if it is true whenever $n = p$, an odd
prime. In 1847 E. Kummer proved the conjecture is true for a certain set of
primes called regular primes. A prime $p$ is called regular if it does not divide
the class number of $\mathbb{Q}(\zeta_p)$. Furthermore, Kummer discovered a beautiful
and elementary criterion for regularity which involves divisibility properties
of the first $(p - 3)/2$ nonvanishing Bernoulli numbers.

We will discuss these three problems in turn.

Define $S_m(n) = 1^m + 2^m + \cdots + (n - 1)^m$. We first give a simple
inductive method for evaluating these sums. The binomial theorem implies

$$(k + 1)^{m+1} - k^{m+1} = 1 + \binom{m+1}{1}k + \binom{m+1}{2}k^2 + \cdots + \binom{m+1}{m}k^m.$$

Substitute $k = 0, 1, 2, \ldots, n - 1$ and add. The result is

$$n^{m+1} = n + \binom{m+1}{1}S_1(n) + \binom{m+1}{2}S_2(n) + \cdots + \binom{m+1}{m}S_m(n). \qquad (1)$$

If one has formulas for $S_1(n), S_2(n), \ldots, S_{m-1}(n)$ then Equation (1)
allows one to find a formula for $S_m(n)$. Bernoulli observed that $S_m(n)$ is a
polynomial of degree $m + 1$ in $n$ with leading term $n^{m+1}/(m + 1)$. This follows
easily by induction from Equation (1). Also, the constant term is always
zero. The value of the other coefficients is less obvious. By direct computation
one finds the coefficient of $n$ to be $-\frac{1}{2}, \frac{1}{6}, 0, -\frac{1}{30}, 0, \frac{1}{42}, 0, -\frac{1}{30}, 0, \frac{5}{66}$
for $m = 1, 2, \ldots, 10$. Further empirical observation of the formulas led
Bernoulli to the following definition and theorem.

Definition. The sequence of numbers $B_0, B_1, B_2, \ldots$, the Bernoulli numbers,
are defined inductively as follows. $B_0 = 1$ and if $B_1, B_2, \ldots, B_{m-1}$ are already
determined then $B_m$ is defined by

$$(m + 1)B_m = -\sum_{k=0}^{m-1} \binom{m+1}{k} B_k. \qquad (2)$$

Written out this becomes the sequence of linear equations

$$1 + 2B_1 = 0$$

$$1 + 3B_1 + 3B_2 = 0$$

$$1 + 4B_1 + 6B_2 + 4B_3 = 0$$

$$1 + 5B_1 + 10B_2 + 10B_3 + 5B_4 = 0.$$

One finds $B_1 = -\frac{1}{2}$, $B_2 = \frac{1}{6}$, $B_3 = 0$, $B_4 = -\frac{1}{30}$, $B_5 = 0$, $B_6 = \frac{1}{42}$, etc.
We shall prove later that the nonzero Bernoulli numbers alternate in sign.
Furthermore we shall see that the Bernoulli numbers with odd index bigger
than 1 vanish.
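The inductive definition (2) translates directly into a short program. The following is a minimal sketch using exact rational arithmetic; the function name `bernoulli_numbers` is a hypothetical choice made for this illustration.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] using (m+1) B_m = -sum_{k<m} C(m+1, k) B_k."""
    B = [Fraction(1)]
    for m in range(1, count):
        acc = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-acc / (m + 1))
    return B


for m, b in enumerate(bernoulli_numbers(11)):
    print(m, b)
```

Each new value costs a sum over all earlier ones, so computing the first $n$ Bernoulli numbers this way takes on the order of $n^2$ rational operations, which is adequate for the small indices discussed in this chapter.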


Lemma 1. Expand $t/(e^t - 1)$ in a power series about the origin as follows:
$t/(e^t - 1) = \sum_{m=0}^{\infty} b_m (t^m/m!)$. Then for all $m$, $b_m = B_m$.

Proof. Multiply both sides by $e^t - 1$ to obtain

$$t = \sum_{n=1}^{\infty} \frac{t^n}{n!} \sum_{m=0}^{\infty} b_m \frac{t^m}{m!}.$$

Equating coefficients of $t^{m+1}$ gives $1 = b_0$ for $m = 0$ and

$$\sum_{k=0}^{m} \binom{m+1}{k} b_k = 0$$

in general. This is the same as the system of Equation (2) which defines the
Bernoulli numbers. Since $B_0 = b_0 = 1$ it follows that $B_m = b_m$ for all $m$. □


We now give the answer obtained by Bernoulli to the question of eval¬ 
uating the sums S m (n). 

Theorem 1. For $m \ge 1$ the sums $S_m(n)$ satisfy

$$(m + 1)S_m(n) = \sum_{k=0}^{m} \binom{m+1}{k} B_k n^{m+1-k}.$$

Proof. In $e^{kt} = \sum_{m=0}^{\infty} k^m (t^m/m!)$ substitute $k = 0, 1, 2, \ldots, n - 1$ and add.
This results in

$$1 + e^t + e^{2t} + \cdots + e^{(n-1)t} = \sum_{m=0}^{\infty} S_m(n) \frac{t^m}{m!}. \qquad (3)$$

The left-hand side is

$$\frac{e^{nt} - 1}{e^t - 1} = \frac{e^{nt} - 1}{t} \cdot \frac{t}{e^t - 1} = \sum_{k=1}^{\infty} n^k \frac{t^{k-1}}{k!} \sum_{j=0}^{\infty} B_j \frac{t^j}{j!}. \qquad (3')$$

Equating the coefficients of $t^m$ on the right-hand sides of Equations (3)
and (3') and multiplying by $(m + 1)!$ gives the result. □

We may reformulate the result of Theorem 1 by introducing an important
class of polynomials known as Bernoulli polynomials. Define

$$B_m(x) = \sum_{k=0}^{m} \binom{m}{k} B_k x^{m-k}.$$

Thus $B_1(x) = x - \frac{1}{2}$, $B_2(x) = x^2 - x + \frac{1}{6}$, etc. Then Theorem 1 may be
stated as

$$S_m(n) = \frac{1}{m + 1}\left(B_{m+1}(n) - B_{m+1}\right).$$

We remark in passing that Lemma 1 yields an easy proof that $B_{2k+1} = 0$
for $k \ge 1$. Since $B_1 = -\frac{1}{2}$ we have

$$\frac{t}{e^t - 1} + \frac{t}{2} = 1 + \sum_{k=2}^{\infty} B_k \frac{t^k}{k!}.$$

The left-hand side is the same as $(t/2)((e^t + 1)/(e^t - 1))$ which is
unchanged if $t$ is replaced by $-t$, i.e., it is an even function of $t$. This implies the
coefficients of odd powers of $t$ on the right-hand side are zero.
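Theorem 1 can be checked against direct summation. The sketch below is only an illustration with hypothetical function names; it evaluates $S_m(n)$ from the Bernoulli numbers and repeats Bernoulli's own example, the sum of the tenth powers of the first thousand integers, which is $S_{10}(1001)$ in the notation of the text.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from the recursion (2), with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def power_sum(m, n):
    """S_m(n) = 1^m + 2^m + ... + (n-1)^m computed via Theorem 1 (m >= 1)."""
    B = bernoulli_numbers(m + 1)
    total = sum(comb(m + 1, k) * B[k] * n ** (m + 1 - k) for k in range(m + 1))
    value = total / (m + 1)
    assert value.denominator == 1  # the formula always yields an integer
    return int(value)


print(power_sum(10, 1001))                 # tenth powers of 1, ..., 1000
print(sum(k ** 10 for k in range(1, 1001)))  # direct summation for comparison
```

The formula needs only $m + 1$ terms regardless of $n$, which is why Bernoulli could carry out the computation by hand.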

We now turn to the relationship between Bernoulli numbers and the
numbers $\zeta(2m)$ for $m = 1, 2, 3, \ldots$. The following result is due to Euler and
constitutes one of his most remarkable calculations. For the history of this
result and its relation to the functional equation of the Riemann zeta function
the reader should consult the article of Raymond Ayoub [88].

Theorem 2. For $m$ a positive integer

$$2\zeta(2m) = (-1)^{m+1} \frac{(2\pi)^{2m}}{(2m)!} B_{2m}.$$

Proof. The proof of this result requires a fact from classical analysis. Namely,
we need the partial fraction expansion for $\cot x$:

$$\cot x = \frac{1}{x} - 2\sum_{n=1}^{\infty} \frac{x}{n^2\pi^2 - x^2}. \qquad (4)$$

There are several ways to derive this expansion. Perhaps the simplest 
way is to substitute t = 1 in the Fourier expansion of cos cat. Alternatively, 
the result follows from taking the logarithmic derivative of the infinite 
product expansion of sin x 


$$\sin x = x \prod_{n=1}^{\infty} \left(1 - \frac{x^2}{n^2\pi^2}\right).$$

This is a standard result in texts on complex variables but it is possible
to give a completely elementary proof (see Chapter 2 of Koblitz [162]).

Using the formula for the geometric series we can expand the right-hand
side of (4) in a power series about 0. This yields

$$x \cot x = 1 - 2\sum_{m=1}^{\infty} \zeta(2m) \frac{x^{2m}}{\pi^{2m}}. \qquad (5)$$

We expand the left-hand side of Equation (5) in another way. Recall

$$\cos x = \frac{e^{ix} + e^{-ix}}{2} \quad \text{and} \quad \sin x = \frac{e^{ix} - e^{-ix}}{2i}.$$

From these expressions we derive

$$x \cot x = ix + \frac{2ix}{e^{2ix} - 1} = 1 + \sum_{n=2}^{\infty} B_n \frac{(2ix)^n}{n!}. \qquad (6)$$

Comparing coefficients of $x^{2m}$ on the right-hand sides of Equations (5)
and (6) yields

$$-\frac{2}{\pi^{2m}}\zeta(2m) = (-1)^m \frac{2^{2m}}{(2m)!} B_{2m}.$$

This is Euler’s result. □


As examples, take $m = 1$, 2 and 3. Since $B_2 = \frac{1}{6}$, $B_4 = -\frac{1}{30}$, and $B_6 = \frac{1}{42}$
we find $\zeta(2) = \pi^2/6$, $\zeta(4) = \pi^4/90$, and $\zeta(6) = \pi^6/945$.

A consequence of Theorem 2 is that $(-1)^{m+1} B_{2m} > 0$ for $m \ge 1$. This is
because $\zeta(2m)$ is a positive real number for such $m$. Thus, the even indexed
Bernoulli numbers are not zero and alternate in sign.

Theorem 2 also enables one to estimate the growth of $B_{2m}$. Namely,
one sees

$$|B_{2m}| > \frac{2(2m)!}{(2\pi)^{2m}}. \qquad (7)$$

Here we have used the simple observation that $\zeta(2m) > 1$. Using the
obvious inequality $e^n > n^n/n!$ (look at the series expansion for $e^n$) we find

$$|B_{2m}| > 2\left(\frac{m}{\pi e}\right)^{2m}. \qquad (8)$$

This shows that the even indexed Bernoulli numbers grow at a very rapid
rate. A consequence which we will use later is $|B_{2n}/2n| \to \infty$ as $n \to \infty$.
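Euler's formula and the lower bound (8) can be compared with numerical values. The following sketch is an illustration only, with hypothetical function names; it uses floating point for $\pi$ and a truncated series for $\zeta(s)$, so agreement is only up to rounding and truncation error.

```python
import math
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from the recursion (2)."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def zeta_even(m):
    """zeta(2m) from Theorem 2: 2 zeta(2m) = (-1)^(m+1) (2 pi)^(2m) B_2m / (2m)!."""
    B2m = bernoulli_numbers(2 * m + 1)[2 * m]
    return (-1) ** (m + 1) * (2 * math.pi) ** (2 * m) * float(B2m) / (2 * math.factorial(2 * m))


def zeta_partial(s, terms):
    """Truncated series 1 + 2^(-s) + ... + terms^(-s)."""
    return sum(n ** (-s) for n in range(1, terms + 1))


def lower_bound(m):
    """Right-hand side of (8): 2 (m / (pi e))^(2m)."""
    return 2 * (m / (math.pi * math.e)) ** (2 * m)


for m in (1, 2, 3):
    print(m, zeta_even(m), zeta_partial(2 * m, 100000))
```

For small $m$ the bound (8) is weak, but it grows faster than any exponential in $m$, which is the content of the remark that $|B_{2n}/2n| \to \infty$.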

We summarize the above properties of Bernoulli numbers in the following 
proposition. 


Proposition 15.1.1

(a) For $k > 1$ and odd, $B_k = 0$.

(b) $(-1)^{m+1} B_{2m} > 0$ for $m = 1, 2, \ldots$.

(c) $|B_{2m}/2m| \to \infty$ as $m \to \infty$.

The third problem that we discuss in this section deals with the relationship
between Bernoulli numbers and the Fermat equation $x^p + y^p = z^p$. This
discussion will be purely expository for the result of Kummer is quite deep
and requires analytic techniques that we have not developed. However we
will introduce the important notion of a regular prime and state the Clausen-
von Staudt congruence which we will prove in the following section. First of
all we introduce the notion of a $p$-integer.

Let $p$ be a prime number. A rational number $r \in \mathbb{Q}$ is said to be a $p$-integer
if $\operatorname{ord}_p(r) \ge 0$. In other words $r$ is a $p$-integer if $r = a/b$, $a, b \in \mathbb{Z}$ and $p \nmid b$.
One also says with slight ambiguity that $p$ does not divide the denominator
of $r$. It is an important observation that the set of $p$-integers forms a ring.
Denote this ring by $\mathbb{Z}_p$. If $r$ and $s$ belong to $\mathbb{Z}_p$ write $r \equiv s \ (p^n)$ if $\operatorname{ord}_p(r - s) \ge
n$, or equivalently, if $r - s = a/b$, $p \nmid b$ and $p^n \mid a$, $a, b \in \mathbb{Z}$. The following
theorem proved independently by T. Clausen and C. von Staudt describes
the denominator of $B_{2m}$. No such complete description of the prime divisors
of the numerator is known.

Theorem 3. For $m \ge 1$, $B_{2m} = A_{2m} - \sum_{p-1 \mid 2m} 1/p$ where $A_{2m} \in \mathbb{Z}$ and the
sum is over all primes $p$ such that $p - 1 \mid 2m$.

Corollary. If $p - 1 \nmid 2m$ then $B_{2m}$ is a $p$-integer. If $p - 1 \mid 2m$ then $pB_{2m} + 1$
is a $p$-integer. More precisely if $p - 1 \mid 2m$ then

$$\operatorname{ord}_p(pB_{2m} + 1) = \operatorname{ord}_p p\left(B_{2m} + \tfrac{1}{p}\right) = 1 + \operatorname{ord}_p\left(B_{2m} + \tfrac{1}{p}\right) \ge 1$$

so that $pB_{2m} \equiv -1 \ (p)$. Finally we notice that 6 always divides the denominator
of $B_{2m}$, $m \ge 1$, since $2 - 1$ and $3 - 1$ divide 2.
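Theorem 3 lends itself to a direct numerical check. The sketch below is an illustration with hypothetical function names; it computes $A_{2m} = B_{2m} + \sum_{p-1 \mid 2m} 1/p$ exactly and confirms that it is an integer for small $m$.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from the recursion (2)."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def is_prime(n):
    """Trial division, adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def staudt_integer(two_m, B):
    """A_2m = B_2m + sum of 1/p over primes p with (p - 1) | 2m."""
    primes = [p for p in range(2, two_m + 2) if is_prime(p) and two_m % (p - 1) == 0]
    return B[two_m] + sum(Fraction(1, p) for p in primes)


B = bernoulli_numbers(41)
for two_m in range(2, 41, 2):
    print(two_m, B[two_m], staudt_integer(two_m, B))
```

For example $B_{12} = -691/2730$ and the primes with $p - 1 \mid 12$ are 2, 3, 5, 7, 13, whose product is 2730, the denominator of $B_{12}$.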

Kummer introduced the notion of a regular prime as follows. 

Definition. An odd prime number $p \in \mathbb{Z}$ is said to be regular if $p$ does not
divide the numerator of any of the numbers $B_2, B_4, \ldots, B_{p-3}$. If $p$ is not
regular it is called irregular. The prime 3 is regular.

By the corollary to Theorem 3, $B_2, B_4, \ldots, B_{p-3}$ are $p$-integers. Therefore
$p$ is regular if $\operatorname{ord}_p B_{2i} = 0$ for $i = 1, \ldots, (p - 3)/2$. It is easily seen that the
units in $\mathbb{Z}_p$ are precisely the elements $x$ with $\operatorname{ord}_p x = 0$. Thus $p$ is regular
if $B_2, B_4, \ldots, B_{p-3}$ are units in $\mathbb{Z}_p$. Equivalently $p$ is irregular if some $B_{2i}$,
$1 \le i \le (p - 3)/2$, is a nonunit in $\mathbb{Z}_p$. The first irregular primes are 37 and
59 for it is known that $\operatorname{ord}_{37}(B_{32}) = 1$ and $\operatorname{ord}_{59}(B_{44}) = 1$ [234]. The
first few irregular primes are 37, 59, 67, 101, 103, 131, 149 and 157. It was proven
by Jensen in 1915 that there are infinitely many irregular primes of the form 
4n + 3. In the next section we give a short proof due to L. Carlitz (1953) 
that infinitely many irregular primes exist. It has not been proven that 
infinitely many regular primes exist. This is somewhat unfortunate in view 
of the following remarkable result of Kummer (1850).

Theorem 4. Let $p$ be a regular prime. Then $x^p + y^p = z^p$ has no solution in
positive integers.

Actually Kummer proved that Fermat’s conjecture is true if $p$ does not
divide the class number of $\mathbb{Q}(\zeta_p)$. In other words the criterion is that for any
nonprincipal ideal $A$ in $\mathbb{Z}[\zeta_p]$, $A^p$ is not principal. This condition is equivalent
to the regularity of $p$. We will not prove this, but the material in the third
section of this chapter is closely related.

C. L. Siegel has given a plausible argument to suggest that the density of
irregular primes is $1 - e^{-1/2} = 0.3935\ldots$. W. Johnson has checked this for
primes less than 30,000 with good results [159]. S. Wagstaff has established
the validity of Fermat’s conjecture for all primes less than 125,000 [234].
Furthermore the information found by Johnson has now been extended by
him to all primes less than 125,000 [234].

If a prime $p$ is irregular one can ask how many nonzero Bernoulli numbers
in the set $\{B_2, B_4, \ldots, B_{p-3}\}$ are divisible by $p$. This number is called the
index of irregularity of $p$. The first prime of index 2 is 157. One of the most
remarkable discoveries made with the aid of the computer is the existence 
of two primes of index 5 [234]. Finally we point out that thus far no pair $p$,
$2i$ with $1 \le i \le (p - 3)/2$ has been found for which $\operatorname{ord}_p B_{2i} > 1$. For the
above remarks and their relation to the celebrated Iwasawa invariants see 
the paper by W. Johnson in the bibliography. 
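The definition of regularity can be tested directly for small primes. The following is a rough sketch with hypothetical function names; it works with exact Bernoulli numbers, so it is only practical for small $p$ and is not the method used in the large computations cited above.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from the recursion (2)."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def is_prime(n):
    """Trial division, adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def irregularity_index(p, B):
    """Number of even k in [2, p-3] with p dividing the numerator of B_k."""
    return sum(1 for k in range(2, p - 2, 2) if B[k].numerator % p == 0)


def irregular_primes_below(bound):
    """Odd primes p < bound that divide the numerator of some B_2, ..., B_{p-3}."""
    B = bernoulli_numbers(bound)
    return [p for p in range(3, bound) if is_prime(p) and irregularity_index(p, B) > 0]


print(irregular_primes_below(160))
```

Since `Fraction` keeps values in lowest terms, testing the numerator modulo $p$ is the same as asking whether $B_k$ is a nonunit in $\mathbb{Z}_p$.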


§2 Congruences Involving Bernoulli Numbers 


We will now prove a number of arithmetic properties of the Bernoulli 
numbers. 

To begin with we direct our efforts toward proving Theorem 3 of the
preceding section. Notice that for $m \ge k$ one has

$$\binom{m+1}{k} = \frac{m + 1}{m + 1 - k}\binom{m}{k}$$

as follows immediately from the definition of the binomial coefficients. Thus,
Theorem 1 of the last section becomes

$$S_m(n) = \sum_{k=0}^{m} \binom{m}{k} B_k \frac{n^{m+1-k}}{m + 1 - k}. \qquad (9)$$

Now, using $\binom{m}{k} = \binom{m}{m-k}$ we see that

$$S_m(n) = \sum_{k=0}^{m} \binom{m}{k} B_{m-k} \frac{n^{k+1}}{k + 1} = B_m n + \binom{m}{1} B_{m-1} \frac{n^2}{2} + \cdots + \frac{n^{m+1}}{m + 1}. \qquad (10)$$

In addition to Equation (10) we need the following simple lemma.

Lemma 1. Let $p$ be a prime number and $k \ge 1$ an integer. Then

(a) $p^k/(k + 1)$ is $p$-integral.

(b) $p^k/(k + 1) \equiv 0 \ (p)$ if $k \ge 2$.

(c) $p^{k-2}/(k + 1)$ is $p$-integral if $k \ge 3$ and $p \ge 5$.

Proof. To prove (a) we show that $k + 1 \le p^k$ for $k \ge 1$. If $k = 1$ the result is
true. If $k + 1 \le p^k$ then $k + 2 \le p^k + 1 < 2p^k \le p^{k+1}$. Now write $k + 1 =
p^a q$ where $(q, p) = 1$. Then $p^k/(k + 1) = p^{k-a}/q$. Since $p^k/(k + 1) \ge 1$ we
conclude that $k \ge a$, i.e., we have proven (a). To prove (b) we notice that
$k + 1 < p^k$ for $k \ge 2$. The proof is the same as for (a). Therefore $k > a$
which proves (b).

As for part (c) use induction to show that $k + 1 < p^{k-2}$ for $k \ge 3$ and
$p \ge 5$. This time one concludes that $k - 2 > a$, so that $p^{k-2}/(k + 1) =
p^{k-2-a}/q$ is $p$-integral (and in fact divisible by $p$). □

Proposition 15.2.1. Let $p$ be a prime and $m \ge 1$ an integer. Then $pB_m$ is
$p$-integral. If $m \ge 2$ is even then $pB_m \equiv S_m(p) \ (p)$.

Proof. The first assertion states that if $p$ divides the denominator of $B_m$ then
$p^2$ does not. First of all, $pB_1 = -p/2$ which is indeed $p$-integral for all $p$.
We proceed by induction.

Suppose $m > 1$. Applying Equation (10) with $n = p$ we see that, since
$S_m(p) \in \mathbb{Z}$, it suffices to prove that

$$\binom{m}{k} (pB_{m-k}) \frac{p^k}{k + 1} \qquad (11)$$

is $p$-integral for $k = 1, 2, \ldots, m$. By induction $pB_{m-k}$ is $p$-integral for $k \ge 1$.
Also by Lemma 1, part (a), $p^k/(k + 1)$ is $p$-integral. It follows $pB_m$ is $p$-integral.
To establish the congruence it is enough to show that

$$\operatorname{ord}_p\left(\binom{m}{k} (pB_{m-k}) \frac{p^k}{k + 1}\right) \ge 1 \quad \text{for } k \ge 1.$$

By Lemma 1, part (b) this is true for $k \ge 2$. For $k = 1$ we need to show

$$\operatorname{ord}_p\left(\frac{m}{2}(pB_{m-1})\,p\right) \ge 1,$$

which is also true since $m$ is even. Actually, for $m$ even, $B_{m-1} = 0$ for $m \ge 4$,
and so it is only necessary to check it for $m = 2$ where it is obvious. □

Lemma 2. Let $p$ be a prime. Then if $p - 1 \nmid m$, $S_m(p) \equiv 0 \ (p)$. If $p - 1 \mid m$
then $S_m(p) \equiv -1 \ (p)$.

Proof. Let $g$ be a primitive root modulo $p$. Then

$$S_m(p) = 1^m + 2^m + \cdots + (p - 1)^m \equiv 1^m + g^m + g^{2m} + \cdots + g^{(p-2)m} \ (p).$$

Thus $(g^m - 1)S_m(p) \equiv g^{m(p-1)} - 1 \equiv 0 \ (p)$. If $p - 1 \nmid m$ then $g^m \not\equiv 1 \ (p)$ and
$S_m(p) \equiv 0 \ (p)$. On the other hand, if $p - 1 \mid m$ then $S_m(p) \equiv 1 + 1 + \cdots + 1 =
p - 1 \equiv -1 \ (p)$. □

We are now in a position to prove Theorem 3. Assume $m$ is even. Then by
Proposition 15.2.1 we know $pB_m$ is $p$-integral and $pB_m \equiv S_m(p) \ (p)$. By the
lemma just proven it follows that $B_m$ is a $p$-integer if $p - 1 \nmid m$ and $pB_m \equiv
-1 \ (p)$ if $p - 1 \mid m$. Thus

$$A_m = B_m + \sum_{p-1 \mid m} \frac{1}{p}$$

is a $p$-integer for all primes $p$. It follows that $A_m \in \mathbb{Z}$ and the proof is complete.
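Both Lemma 2 and the congruence of Proposition 15.2.1 can be verified for small cases. The sketch below is an illustration with hypothetical function names; the helper `congruent_mod_p` implements the congruence between $p$-integers defined in Section 1.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from the recursion (2)."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def S(m, n):
    """S_m(n) = 1^m + 2^m + ... + (n-1)^m by direct summation."""
    return sum(k ** m for k in range(1, n))


def congruent_mod_p(r, s, p):
    """True if r - s = a/b with p not dividing b and p dividing a."""
    d = Fraction(r) - Fraction(s)
    return d.denominator % p != 0 and d.numerator % p == 0


def lemma2_holds(m, p):
    """S_m(p) is -1 mod p when (p - 1) | m and 0 mod p otherwise."""
    expected = (p - 1) if m % (p - 1) == 0 else 0
    return S(m, p) % p == expected


B = bernoulli_numbers(21)
for p in (2, 3, 5, 7, 11, 13):
    for m in range(2, 21, 2):
        print(p, m, lemma2_holds(m, p), congruent_mod_p(p * B[m], S(m, p), p))
```

Taken together the two checks say that $pB_m \equiv -1 \ (p)$ exactly when $p - 1 \mid m$, which is the content of Theorem 3.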

The reader may suspect by this time that the consequences of Equation
(10) have not been exhausted. The following proposition is another important
consequence of that equation. Write the $m$th Bernoulli number as
$B_m = U_m/V_m$ where $(U_m, V_m) = 1$ and $V_m > 0$. We are assuming $m$ to be
even.


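Proposition 15.2.2. If $m$ is even, $m \ge 2$, then for all $n \ge 1$ we have

$$V_m S_m(n) \equiv U_m n\ (n^2).$$

Proof. Consider the terms in Equation (10) for $k \ge 1$ and fixed $n$:

$$\binom{m}{k} B_{m-k}\, \frac{n^{k+1}}{k+1}
= \binom{m}{k} B_{m-k}\, \frac{n^{k-1}}{k+1}\, n^2 = A_k^m n^2. \tag{12}$$

We will show that for $p \mid n$ and $p \ne 2, 3$, $\operatorname{ord}_p(A_k^m) \ge 0$.
Furthermore if $2 \mid n$ then $\operatorname{ord}_2(A_k^m) \ge -1$ and if $3 \mid n$
then $\operatorname{ord}_3(A_k^m) \ge -1$. This will imply that the greatest common
divisor of $n$ and the denominator of $A_k^m$ is a divisor of 6, and thus this
will also be true of the sum of the $A_k^m$. In other words one can write

$$S_m(n) = B_m n + \frac{A n^2}{lB},$$

where $A$ is an integer, $(B, n) = 1$ and $l \mid 6$. Multiplying by $BV_m$ and
recalling that $6 \mid V_m$ by the Corollary to Theorem 3, the result follows
immediately.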

In order to prove the $\operatorname{ord}_p$ estimates we use the Corollary of
Theorem 3, which implies that $\operatorname{ord}_p(B_{m-k}) \ge -1$ for all
$m - k \ge 0$ and all $p$. Assume first of all that $p \ne 2, 3$, $p \mid n$. The
cases $k = 1, 2$ are simple by inspection using the fact that $B_t = 0$ for
$t > 1$ and odd, that $B_1 = -\frac{1}{2}$, and that $\operatorname{ord}_p 3 = 0$.
If $k \ge 3$, then

$$\operatorname{ord}_p\!\left(B_{m-k}\, \frac{n^{k-1}}{k+1}\right)
\ge -1 + (k-1)\operatorname{ord}_p n - \operatorname{ord}_p(k+1)
\ge k - 2 - \operatorname{ord}_p(k+1) \ge 0 \tag{13}$$

by part (c) of Lemma 1.

Consider now $p = 2$. If $k = 1$ then $B_{m-1} = 0$ for $m > 2$ ($m$ is even),
while for $m = 2$, $A_1^2$ becomes $2 \cdot (-\frac{1}{2}) \cdot \frac{1}{2}$, which
has $\operatorname{ord}_2$ equal to $-1$. For $k > 1$ we notice that $B_{m-k} = 0$
unless $k$ is even or $k = m - 1$. But $k$ even implies
$\operatorname{ord}_2(k+1) = 0$, while for $k = m - 1$,
$A_{m-1}^m = -\frac{1}{2} n^{m-2}$, which has $\operatorname{ord}_2$ greater than or
equal to $-1$.

Finally consider the case $p = 3$, $3 \mid n$. Then
$\operatorname{ord}_3(A_2^m) \ge -1$ and $\operatorname{ord}_3(A_3^m) \ge 1$, as one
easily checks. But for $k \ge 4$ one shows exactly as in the lemma that
$\operatorname{ord}_3(3^{k-2}/(k+1)) \ge 0$, so that
$\operatorname{ord}_3(A_k^m) \ge 0$. This completes the proof. □

As a simple numerical illustration of this proposition consider
$B_2 = \frac{1}{6}$, $U_2 = 1$, $V_2 = 6$ and let $n = 6$. The congruence reads

$$6(1^2 + 2^2 + 3^2 + 4^2 + 5^2) \equiv 6\ (36)$$

and more generally

$$6(1^2 + 2^2 + \cdots + (n-1)^2) \equiv n\ (n^2).$$
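
The numerical illustration can be extended mechanically. The following
sketch (helper names are ours; $B_1 = -1/2$ is assumed) checks the congruence
$V_m S_m(n) \equiv U_m n\ (n^2)$ for a range of even $m$ and all small $n$.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def power_sum(m, n):
    """S_m(n) = 1^m + 2^m + ... + (n-1)^m."""
    return sum(j ** m for j in range(1, n))


def prop_15_2_2_holds(m, n, B):
    """Check V_m * S_m(n) = U_m * n modulo n^2, where B_m = U_m / V_m."""
    # Fraction keeps B_m in lowest terms with a positive denominator.
    U, V = B[m].numerator, B[m].denominator
    return (V * power_sum(m, n) - U * n) % (n * n) == 0
```

With $m = 2$ and $n = 6$ this reproduces the congruence
$6 \cdot 55 = 330 \equiv 6\ (36)$ displayed above.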

Corollary. Let $m$ be even and $p$ a prime such that $p - 1 \nmid m$. Then

$$S_m(p) \equiv B_m p\ (p^2).$$

Proof. By Theorem 3, $p \nmid V_m$. In the proposition, put $n = p$, and divide
both sides of the resulting congruence by $V_m$, which is permissible since
$p \nmid V_m$. The result follows. □

We are now in a position to prove the very useful congruences of G. 
Voronoi. According to the book of Uspensky and Heaslet [230], Voronoi 
discovered these congruences in 1889 while still a student. 

Proposition 15.2.3. Let $m \ge 2$ be even and define $U_m$ and $V_m$ as in the
last proposition. Suppose $a$ and $n$ are positive integers with $(a, n) = 1$.
Then

$$(a^m - 1)U_m \equiv m a^{m-1} V_m \sum_{j=1}^{n-1} j^{m-1}
\left[\frac{ja}{n}\right]\ (n), \tag{14}$$

where $[\alpha]$ is the unique integer $k$ such that $k \le \alpha < k + 1$.

Proof. For $1 \le j < n$ write $ja = q_j n + r_j$ where $0 \le r_j < n$. Then
$[ja/n] = q_j$ and, since $(a, n) = 1$, the two sets $\{1, 2, 3, \ldots, n-1\}$
and $\{r_1, r_2, \ldots, r_{n-1}\}$ are identical. By the binomial theorem

$$j^m a^m \equiv r_j^m + m q_j n r_j^{m-1}\ (n^2).$$

Since $r_j \equiv ja\ (n)$ we have

$$j^m a^m \equiv r_j^m + m a^{m-1} n\, j^{m-1} \left[\frac{ja}{n}\right]\ (n^2).$$

Summing over $j = 1, 2, \ldots, n-1$ gives

$$S_m(n)\, a^m \equiv S_m(n) + m a^{m-1} n \sum_{j=1}^{n-1} j^{m-1}
\left[\frac{ja}{n}\right]\ (n^2).$$

The result now follows from the congruence of Proposition 15.2.2. □
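
The Voronoi congruence, in the form
$(a^m - 1)U_m \equiv m a^{m-1} V_m \sum_{j=1}^{n-1} j^{m-1}[ja/n]\ (n)$, lends
itself to direct experiment. The sketch below is illustrative; the function
names are ours and the Bernoulli numbers are computed with $B_1 = -1/2$.

```python
from fractions import Fraction
from math import comb, gcd


def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def voronoi_holds(m, a, n, B):
    """Check the Voronoi congruence modulo n for even m >= 2 and gcd(a, n) = 1."""
    U, V = B[m].numerator, B[m].denominator
    lhs = (a ** m - 1) * U
    # (j * a) // n is the integer part [ja/n].
    rhs = m * a ** (m - 1) * V * sum(j ** (m - 1) * ((j * a) // n) for j in range(1, n))
    return (lhs - rhs) % n == 0
```

Only pairs $(a, n)$ with $(a, n) = 1$ should be fed to voronoi_holds, since
the proof uses that $j \mapsto r_j$ permutes the nonzero residues.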
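Corollary. Let $p$ be a prime, $p \equiv 3\ (4)$. Set $m = (p+1)/2$. Then if
$p > 3$

$$2\left(2 - \left(\frac{2}{p}\right)\right) B_m \equiv
-\sum_{j=1}^{(p-1)/2} \left(\frac{j}{p}\right)\ (p),$$

where $(x/p)$ denotes the Legendre symbol.

Proof. Notice $m - 1 = (p-1)/2$, so by Euler's criterion
$a^{m-1} \equiv (a/p)\ (p)$ for all integers $a$.

In Voronoi's congruence set $a = 2$ and $n = p$. Using the above remark
we find

$$\left(2\left(\frac{2}{p}\right) - 1\right) U_m \equiv
m \left(\frac{2}{p}\right) V_m \sum_{j=1}^{p-1}
\left(\frac{j}{p}\right)\left[\frac{2j}{p}\right]\ (p).$$

Now, $[2j/p] = 0$ for $1 \le j \le m - 1$ and $[2j/p] = 1$ for $m \le j < p$.
Also, $2m \equiv 1\ (p)$ and $p \nmid V_m$ by Theorem 3. Thus, multiplying both
sides by $2(2/p)$ and dividing by $V_m$,

$$2\left(2 - \left(\frac{2}{p}\right)\right) B_m \equiv
\sum_{j=m}^{p-1} \left(\frac{j}{p}\right)\ (p).$$

Since $\sum_{j=1}^{p-1} (j/p) = 0$, the proof is complete. □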


This corollary can be used to prove an interesting result relating class
numbers to Bernoulli numbers. Let $p$ be a prime, $p \equiv 3\ (4)$, and consider
the imaginary quadratic number field $\mathbb{Q}(\sqrt{-p})$. Let $h$ denote its
class number. It can be shown that if $p > 3$

$$h = \frac{1}{2 - \left(\frac{2}{p}\right)}
\sum_{j=1}^{(p-1)/2} \left(\frac{j}{p}\right).$$

For a proof, see Chapter 5, Section 4 of the book by Borevich and
Shafarevich [9]. Combining the corollary with this formula for $h$ gives the
following remarkable congruence:

$$h \equiv -2B_{(p+1)/2}\ (p).$$
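
The congruence $h \equiv -2B_{(p+1)/2}\ (p)$ can be tested for small primes
$p \equiv 3\ (4)$. The sketch below assumes the class number formula in the
form $h = (2 - (2/p))^{-1} \sum_{j=1}^{(p-1)/2} (j/p)$ for $p > 3$; the function
names are ours, and the Legendre symbol is evaluated by Euler's criterion.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def class_number(p):
    """Class number of Q(sqrt(-p)) for a prime p = 3 mod 4, p > 3 (assumed formula)."""
    total = sum(legendre(j, p) for j in range(1, (p - 1) // 2 + 1))
    h, rem = divmod(total, 2 - legendre(2, p))
    assert rem == 0
    return h


def congruence_holds(p):
    """Check h = -2 * B_{(p+1)/2} modulo p."""
    b = bernoulli((p + 1) // 2)[-1]
    # The denominator of b is prime to p, so it can be inverted modulo p.
    rhs = (-2 * b.numerator * pow(b.denominator, -1, p)) % p
    return class_number(p) % p == rhs
```

For example, $p = 23$ gives $h = 3$, and $-2B_{12} = 691/1365 \equiv 1/8
\equiv 3\ (23)$.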

The Voronoi congruences lead to many properties of Bernoulli numbers. 
The following proposition is often attributed to J. C. Adams. It gives some 
information about the numerator of B m . 

Proposition 15.2.4. If $p - 1 \nmid m$ then $B_m/m$ is a $p$-integer.

Proof. By Theorem 3, $B_m$ is a $p$-integer. Write $m = p^t m_0$ where
$p \nmid m_0$. In the Voronoi congruence, Equation (14), put $n = p^t$. Then
$(a^m - 1)U_m \equiv 0\ (p^t)$. Choose $a$ to be a primitive root modulo $p$.
Since $p - 1 \nmid m$ we have $p \nmid a^m - 1$. Thus $U_m \equiv 0\ (p^t)$, and
$B_m/m = U_m/mV_m$ is a $p$-integer. □
As a numerical illustration take $m = 22$ and $p = 11$. Then
$B_{22} = 11 \cdot 131 \cdot 593 / (2 \cdot 3 \cdot 23)$, so $B_{22}/22$ is integral
at 11. Indeed it is a unit at 11. As a further example take $m = 50$ and
$p = 5$. One can factor $B_{50}$ as follows:

$$B_{50} = \frac{5 \cdot 5 \cdot 417202699 \cdot 47464429777438199}{2 \cdot 3 \cdot 11}.$$

Clearly, $B_{50}/50$ is a unit at 5. Less clear is the fact that the 17 digit
number in the numerator is a prime!
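
Both numerical examples are easy to confirm by machine. The sketch below
(helper names ours, $B_1 = -1/2$ assumed) recomputes $B_{22}$ and $B_{50}$
exactly and evaluates the $p$-adic order of $B_m/m$; it does not attempt to
verify the primality of the 17 digit factor.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def ord_p(x, p):
    """p-adic order of a nonzero rational number x."""
    x = Fraction(x)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v
```

A value of ord_p equal to 0 means that the number is a unit at $p$, which is
the situation in both examples.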

The following theorem in the case $e = 1$ is due to Kummer. These
congruences are now referred to as the Kummer congruences.


Theorem 5. Suppose $m \ge 2$ is even, $p$ a prime, and $p - 1 \nmid m$. Define
$C_m = (1 - p^{m-1})B_m/m$. If $m' \equiv m\ (\phi(p^e))$ we have
$C_{m'} \equiv C_m\ (p^e)$.

Proof. Write, as usual, $B_m = U_m/V_m$. Let $t = \operatorname{ord}_p m$.
Proposition 15.2.4 shows $p^t \mid U_m$. In Equation (14) set $n = p^{e+t}$. Since
$p^t$ divides both $m$ and $U_m$ we may divide the resulting congruence
throughout by $p^t$. Since $(m/p^t)V_m$ is prime to $p$ we arrive at the
following congruence:

$$\frac{(a^m - 1)B_m}{m} \equiv a^{m-1} \sum_{j=1}^{p^{e+t}-1} j^{m-1}
\left[\frac{ja}{p^{e+t}}\right]\ (p^e). \tag{15}$$

This congruence will lead the way to a full proof of the theorem. We will
give the proof first in the case $e = 1$. This case reveals the main idea, which
is quite simple, and avoids a slightly messy calculation which is necessary
when $e > 1$.

In the above congruence assume $e = 1$. On the right-hand side we may
omit those $j$ which are divisible by $p$. If $p \nmid j$, then
$j^{p-1} \equiv 1\ (p)$. Also, since $p \nmid a$, $a^{p-1} \equiv 1\ (p)$. Thus modulo
$p$ the right-hand side is unchanged if we replace $m$ by $m'$ with
$m' \equiv m\ (p - 1)$. It follows that

$$\frac{(a^{m'} - 1)B_{m'}}{m'} \equiv \frac{(a^m - 1)B_m}{m}\ (p).$$

Choose $a$ to be a primitive root modulo $p$. Since $p - 1 \nmid m$ we have
$a^{m'} - 1 \equiv a^m - 1 \not\equiv 0\ (p)$. Consequently,

$$\frac{B_{m'}}{m'} \equiv \frac{B_m}{m}\ (p).$$

When $e > 1$ this procedure must be modified because the terms involving $j$
divisible by $p$ are not so easily disposed of. What we do is to separate them
out and rewrite the corresponding sum. More precisely,

$$\sum_{j=1}^{p^{e+t}-1} j^{m-1}\left[\frac{ja}{p^{e+t}}\right]
= \sum_{\substack{j=1 \\ (p,j)=1}}^{p^{e+t}-1} j^{m-1}\left[\frac{ja}{p^{e+t}}\right]
+ p^{m-1}\sum_{i=1}^{p^{e+t-1}-1} i^{m-1}\left[\frac{ia}{p^{e+t-1}}\right].$$

Consider the congruence (15) with $e$ replaced by $e - 1$ and recall that
$m - 1 \ge 1$. We find

$$\frac{p^{m-1}(a^m - 1)B_m}{m} \equiv p^{m-1} a^{m-1}
\sum_{i=1}^{p^{e+t-1}-1} i^{m-1}\left[\frac{ia}{p^{e+t-1}}\right]\ (p^e).$$

Putting all this together yields

$$\frac{(1 - p^{m-1})(a^m - 1)B_m}{m} \equiv a^{m-1}
\sum_{\substack{j=1 \\ (p,j)=1}}^{p^{e+t}-1} j^{m-1}
\left[\frac{ja}{p^{e+t}}\right]\ (p^e). \tag{16}$$

If $p \nmid j$ and $m' \equiv m\ (\phi(p^e))$ then
$j^{m'-1} \equiv j^{m-1}\ (p^e)$. Thus the right-hand side of (16) is unchanged
modulo $p^e$ if $m$ is replaced by $m'$ with $m' \equiv m\ (\phi(p^e))$. The proof
now proceeds exactly as in the case $e = 1$ and yields the full result. □
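
The Kummer congruences $C_{m'} \equiv C_m\ (p^e)$, with
$C_m = (1 - p^{m-1})B_m/m$, can be examined numerically. The sketch below is
illustrative only: the function names are ours, $B_1 = -1/2$ is assumed, and
the caller is expected to supply $m' \equiv m\ (\phi(p^e))$ with
$p - 1 \nmid m$.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def ord_p(x, p):
    """p-adic order of a nonzero rational number x."""
    x = Fraction(x)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def kummer_C(m, p, B):
    """C_m = (1 - p^(m-1)) * B_m / m."""
    return (1 - Fraction(p) ** (m - 1)) * B[m] / m


def kummer_holds(m, m2, p, e, B):
    """Check that C_{m2} - C_m has p-adic order at least e."""
    diff = kummer_C(m2, p, B) - kummer_C(m, p, B)
    return diff == 0 or ord_p(diff, p) >= e
```

For instance, with $p = 5$, $m = 2$, $m' = 6$ one finds
$C_6 - C_2 = -781/63 + 1/3 = -760/63$, which is divisible by 5.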


We make a short detour to indicate a modern interpretation of the
Kummer congruences.

Recall the Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$. In
Exercise 25 of Chapter 2 we mentioned that $\zeta(s)$ can be extended to a
function holomorphic on the entire complex plane except at $s = 1$, where it
has a simple pole with residue 1. Moreover, it can be shown that $\zeta(s)$
satisfies the functional equation

$$\zeta(1 - s) = 2(2\pi)^{-s} \cos\!\left(\frac{\pi s}{2}\right) \Gamma(s)\zeta(s).$$

The $\Gamma$-function is defined and discussed in Chapter 16, Section 6. All we
require here is the fact that $\Gamma(m) = (m - 1)!$ when $m$ is a positive
integer.

Assume $m \ge 2$ is an even integer. Combining the above functional
equation with Theorem 2 we find

$$\zeta(1 - m) = -\frac{B_m}{m}.$$

Define $\zeta^*(s) = (1 - p^{-s})\zeta(s)$. Then
$\zeta^*(1 - m) = -(1 - p^{m-1})B_m/m$ and Theorem 5 states that if
$m' \equiv m\ (\phi(p^e))$ then

$$\zeta^*(1 - m') \equiv \zeta^*(1 - m)\ (p^e). \tag{17}$$

For a fixed prime $p$, the function $d(n, m) = p^{-\operatorname{ord}_p(n-m)}$
defines a metric on $\mathbb{Z}$, the $p$-adic metric. In this metric two integers
are close if their difference is divisible by a high power of $p$. The
congruence (17) may be stated informally as follows: if $m'$ and $m$ are close
$p$-adically, and $m' \equiv m\ (p - 1)$, then $\zeta^*(1 - m')$ and
$\zeta^*(1 - m)$ are close $p$-adically. This suggests the possibility of
extending $\zeta^*$ to the metric completion of $\mathbb{Z}$, the ring of $p$-adic
integers. These ideas were made precise by H. Leopoldt and T. Kubota, who were
the first to construct $p$-adic zeta functions and investigate their
properties. Since then many other approaches have been devised. In the method
due to B. Mazur the Bernoulli numbers are expressed as a certain $p$-adic
integral of the functions $x^m$. The Kummer congruences have a very natural
proof in this context. For details the reader is referred to Chapter 2 of
[162]. The truly remarkable fact that properties of $p$-adic zeta functions
(and $p$-adic $L$-functions) are intimately related to the structure of class
groups of cyclotomic fields is due to K. Iwasawa. Iwasawa gives a rather
condensed and austere account of his theory in his monograph [155]. Another
exposition of these matters is found in S. Lang [167].

We conclude this section with an application of Theorem 5. Namely 
we will prove that there exist infinitely many irregular primes. This proof 
is due to L. Carlitz [105]. 


Theorem 6. The set of irregular primes is infinite.

Proof. Let $\{p_1, \ldots, p_s\}$ be a set of irregular primes. We will find an
irregular prime not in this set.

Let $k \ge 2$ be even and set $n = k(p_1 - 1) \cdots (p_s - 1)$. If the set is
empty choose $n = k$. By Proposition 15.1.1, part (c), choose $k$ so large that
$|B_n/n| > 1$. Choose a prime $p$ with $\operatorname{ord}_p(B_n/n) > 0$. By
Claussen-von Staudt $p - 1 \nmid n$. Thus $p \ne p_i$, $i = 1, \ldots, s$. Also
$p \ne 2$. We will show that $p$ is irregular.

Let $n \equiv m\ (p - 1)$ where $0 \le m < p - 1$. Then $m$ is even and
$m \ne 0$. Thus $2 \le m \le p - 3$. By the Kummer congruence

$$\frac{B_n}{n} \equiv \frac{B_m}{m}\ (p).$$

Since $\operatorname{ord}_p(B_n/n) > 0$ and
$\operatorname{ord}_p(B_n/n - B_m/m) > 0$ it follows that

$$\operatorname{ord}_p\!\left(\frac{B_m}{m}\right) = \operatorname{ord}_p B_m > 0$$

(recall that $m < p$, so $p \nmid m$), which shows that $p$ is irregular.
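
Theorem 6 is an existence statement, but the first few irregular primes are
easily found by direct search. The sketch below takes a prime $p$ to be
irregular when $p$ divides the numerator of some $B_m$ with $m$ even and
$2 \le m \le p - 3$, which is the criterion used at the end of the proof
above; the function names are ours.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def is_prime(q):
    """Trial division; adequate for the small numbers used here."""
    return q >= 2 and all(q % d for d in range(2, int(q ** 0.5) + 1))


def irregular_primes(limit):
    """Primes p <= limit dividing the numerator of some B_m, m even, 2 <= m <= p - 3."""
    B = bernoulli(limit)
    return [
        p
        for p in range(5, limit + 1)
        if is_prime(p) and any(B[m].numerator % p == 0 for m in range(2, p - 2, 2))
    ]
```

A search up to 110 returns 37, 59, 67, 101 and 103.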




§3 Herbrand’s Theorem 

Let $D_m$ be the ring of algebraic integers in the cyclotomic number field
$\mathbb{Q}(\zeta_m)$ and let $P$ be a prime ideal of $D_m$ not containing $m$.
Thus if $p$ is the rational prime in $P$ then $p \nmid m$. In Section 3 of
Chapter 14 we associated to $P$ a Gauss sum $g(P)$ and showed
$g(P)^m = \Phi(P) \in D_m$. The Stickelberger relation proved in Theorem 2 of
that section gave the prime ideal decomposition of $\Phi(P)$ in $D_m$, namely

$$(\Phi(P)) = P^{\sum_t t\sigma_t^{-1}}.$$

Here the exponent is an element of the integral group ring $\mathbb{Z}[G]$ of
the Galois group $G$ of $\mathbb{Q}(\zeta_m)$ and $t$ ranges over the integers
between 1 and $m$ which are relatively prime to $m$. The automorphism
$\sigma_t$ sends $\zeta_m$ to $\zeta_m^t$. We remind the reader that the above
exponential notation is a shorthand for

$$(\Phi(P)) = \prod_{\substack{(t,m)=1 \\ 1 \le t < m}}
\left(\sigma_t^{-1}(P)\right)^t.$$

If $A$ is an ideal relatively prime to $m$ then $A$ is a product of prime
ideals not containing $m$. It follows that $A^{\sum_t t\sigma_t^{-1}}$ is
principal. The following proposition will be needed. We postpone the proof
until later.

Proposition 15.3.1. Let K be an algebraic number field and let M be a fixed 
ideal in the ring of integers of K. Then every ideal class of K contains an ideal 
prime to M. 

If $\alpha$ is in the group ring $\mathbb{Z}[G]$, where $G$ is the Galois group
of $\mathbb{Q}(\zeta_m)$, then $\alpha$ operates in the obvious way on the ideal
class group of $\mathbb{Q}(\zeta_m)$. The above proposition implies that if
$\alpha = \sum_t t\sigma_t^{-1}$ then $\alpha$ sends every ideal class to the
identity class. One says that $\alpha$ annihilates the class group. It is
natural to ask if there are other such elements of the group ring. Further
annihilating elements are given below. First we need a definition.

Definition. The element $\theta = \sum_t \langle t/m \rangle \sigma_t^{-1}$,
where $t$ runs over a set of representatives for the residue classes
relatively prime to $m$, is called the Stickelberger element. Here
$\langle t/m \rangle$ denotes the fractional part of $t/m$, which depends only
on the residue of $t$ modulo $m$. $\theta$ is an element of the rational group
ring $\mathbb{Q}[G]$. If $b$ is an integer prime to $m$ let
$r_b = (\sigma_b - b)\theta$.

The following proposition, whose proof we will postpone, is very important. 

Proposition 15.3.2. The elements $r_b$ are in $\mathbb{Z}[G]$ and annihilate the class group.

We will see later that this proposition follows without much difficulty 
from the Stickelberger relation. 

With these preliminaries and assumed propositions in mind we proceed 
to the principal goal of this section, the statement and proof of Herbrand’s 
theorem. 

Let $m = l$, an odd prime. Roughly speaking, Herbrand's theorem states
that if $l$ does not divide a certain Bernoulli number, then a piece of the
class group of $\mathbb{Q}(\zeta_l)$ is missing. To make this statement precise
we need a few definitions.

Let $\mathcal{A}$ be the subgroup of the ideal class group of
$\mathbb{Q}(\zeta_l)$ consisting of elements whose order divides $l$. In other
words, an ideal class is in $\mathcal{A}$ if it contains an ideal whose $l$th
power is principal.

Definition. Let $1 \le i \le l - 1$. Define

$$\mathcal{A}_i = \{A \in \mathcal{A} \mid A^{\sigma_t} = A^{t^i},\ 1 \le t < l\}.$$

It is easily seen that each $\mathcal{A}_i$ is a subgroup of $\mathcal{A}$. Also,
since each element of $\mathcal{A}$ has order dividing $l$, the exponents can be
computed modulo $l$, i.e., $\mathcal{A}$ is acted on by the group ring
$\mathbb{Z}/l\mathbb{Z}[G]$. If $t \in \mathbb{Z}$ we denote by $\bar{t}$ its
residue class modulo $l$.
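Lemma 1. $\mathcal{A}$ is the direct product of the $\mathcal{A}_i$. In other
words, $\mathcal{A} = \mathcal{A}_1 \mathcal{A}_2 \cdots \mathcal{A}_{l-1}$ and
$\mathcal{A}_i \cap \mathcal{A}_j = e$ (the identity class) if $i \ne j$.

Proof. For each $i$ with $1 \le i \le l - 1$ we define elements
$\varepsilon_i \in \mathbb{Z}/l\mathbb{Z}[G]$ by the formula

$$\varepsilon_i = -\sum_{t=1}^{l-1} \bar{t}^{\,-i} \sigma_t.$$

Replacing $t$ by $ts$ in the formula leads to the relation
$\sigma_s \varepsilon_i = \bar{s}^{\,i} \varepsilon_i$ provided that $l \nmid s$.
It follows that $\mathcal{A}^{\varepsilon_i} \subseteq \mathcal{A}_i$. On the
other hand, if $A \in \mathcal{A}_i$ then

$$A^{\varepsilon_i} = A^{-\sum_t t^{-i} t^i} = A^{-(l-1)} = A.$$

It follows that $\mathcal{A}^{\varepsilon_i} = \mathcal{A}_i$.

By Lemma 2 of Section 2 we see that
$\varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_{l-1} = \sigma_1$, the
identity automorphism. Thus

$$\mathcal{A} = \mathcal{A}^{\varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_{l-1}}
\subseteq \mathcal{A}^{\varepsilon_1} \mathcal{A}^{\varepsilon_2} \cdots
\mathcal{A}^{\varepsilon_{l-1}}
= \mathcal{A}_1 \mathcal{A}_2 \cdots \mathcal{A}_{l-1}.$$

Suppose $i \ne j$ and $A \in \mathcal{A}_i \cap \mathcal{A}_j$. Then
$A^{\sigma_t} = A^{t^i} = A^{t^j}$. We can choose $t$ to be a primitive root
modulo $l$. Then $t^i \not\equiv t^j\ (l)$. Since $A^{t^i - t^j} = e$ and $A$ has
order dividing $l$ it follows that $A = e$. □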

The following theorem of J. Herbrand [149] gives a Bernoulli criterion
for the triviality of $\mathcal{A}_i$. The proof emerges from the interplay of
the Stickelberger relation and the Voronoi congruences.

Theorem 7 (J. Herbrand). Let $i$ be an odd integer, $1 \le i < l$, and define
$j$ by

$$i + j = l.$$

Then $\mathcal{A}_1 = (e)$. If $i \ge 3$ and $l \nmid B_j$ then
$\mathcal{A}_i = (e)$.

Proof. Let $A \in \mathcal{A}_1$. Then, by Stickelberger's relation

$$e = A^{\sum_t t\sigma_t^{-1}} = A^{\sum_t t t^{-1}} = A^{l-1} = A^{-1}.$$

This shows $\mathcal{A}_1 = (e)$ as asserted.

Now suppose $i$ is odd and $3 \le i \le l - 2$. Let $A \in \mathcal{A}_i$. By
Proposition 15.3.2, $A^{r_b} = e$ where $b$ is any integer prime to $l$. We
analyze this relation more carefully.

By definition $r_b = (\sigma_b - b)\theta$. Now,

$$\sigma_b \theta = \sum_t \langle t/l \rangle \sigma_b \sigma_t^{-1}
= \sum_t \langle bt/l \rangle \sigma_t^{-1}.$$

Thus

$$r_b = \sum_t \left(\langle bt/l \rangle - b\langle t/l \rangle\right) \sigma_t^{-1}.$$

Write $bt = q_t l + s_t$ with $0 \le s_t < l$. Then
$\langle bt/l \rangle - b\langle t/l \rangle = s_t/l - bt/l = -q_t = -[bt/l]$.
This shows $r_b = -\sum_{t=1}^{l-1} [bt/l]\, \sigma_t^{-1} \in \mathbb{Z}[G]$.

Suppose $A \in \mathcal{A}_i$. Applying $\sigma_t^{-1}$ to $A$ has the effect of
raising $A$ to the power $t^{-i} = t^{j-1}$. Thus, applying $r_b$ to $A$ has the
effect of raising $A$ to the power $-\sum_{t=1}^{l-1} [bt/l]\, t^{j-1}$.

Write $B_j = U_j/V_j$ with $(U_j, V_j) = 1$. The Voronoi congruence,
Proposition 15.2.3, shows after some relabeling

$$(b^j - 1)U_j \equiv j b^{j-1} V_j \sum_{t=1}^{l-1} t^{j-1}
\left[\frac{bt}{l}\right]\ (l).$$

By the previous observations the right-hand side of this congruence
annihilates any element $A \in \mathcal{A}_i$. Thus, for such an element,
$A^{(b^j - 1)U_j} = e$. Choosing $b$ to be a primitive root modulo $l$ we see
$l \nmid b^j - 1$ and so $A^{U_j} = e$. If $l \nmid B_j$, then $l \nmid U_j$ and so
$A = e$. Thus $l \nmid B_j$ implies $\mathcal{A}_i = (e)$ as asserted.

□
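
Herbrand's theorem can be turned into a small computation: for a given odd
prime $l$ one lists the odd $i$, $3 \le i \le l - 2$, for which $l$ divides
the numerator of $B_{l-i}$. For every other odd $i$ in that range the theorem
asserts that $\mathcal{A}_i$ is trivial. The sketch below is illustrative; the
function names are ours and nothing is computed about the class group itself.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return [B_0, ..., B_n] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def herbrand_candidates(l):
    """Pairs (i, j) with i odd, 3 <= i <= l - 2, i + j = l, and l | numerator of B_j."""
    B = bernoulli(l - 3)
    return [(i, l - i) for i in range(3, l - 1, 2) if B[l - i].numerator % l == 0]
```

For $l = 37$ the only pair returned is $(5, 32)$, reflecting the fact that 37
divides the numerator of $B_{32}$; for a regular prime such as 31 the list is
empty.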

We remark that the converse of Herbrand's theorem was established by
K. Ribet in 1976 [208]. Namely, he showed that if $j$ is even and
$2 \le j \le l - 3$ then $l \mid B_j$ implies $\mathcal{A}_i \ne (e)$ for
$i = l - j$. This beautiful existence theorem depends on subtle arithmetic
properties of modular forms and is, unfortunately, beyond the scope of this
book.

Write $\mathcal{A}^+ = \mathcal{A}_2 \mathcal{A}_4 \cdots \mathcal{A}_{l-1}$ and
$\mathcal{A}^- = \mathcal{A}_1 \mathcal{A}_3 \cdots \mathcal{A}_{l-2}$. Then
$\mathcal{A} = \mathcal{A}^+ \mathcal{A}^-$ and
$\mathcal{A}^+ \cap \mathcal{A}^- = (e)$ (see Exercise 23). The theorem of
Herbrand implies $|\mathcal{A}^-| = 1$ if $l \nmid B_j$ for
$j = 2, 4, \ldots, l - 3$. This was already known to Kummer who also showed, in
essence, that $|\mathcal{A}^-| = 1$ implies $|\mathcal{A}^+| = 1$. Thus, as we
mentioned earlier, Kummer showed that $l \nmid B_j$ for $j = 2, 4, \ldots, l - 3$
implies the class number of $\mathbb{Q}(\zeta_l)$ is not divisible by $l$.

One of the most famous open problems in algebraic number theory is
the conjecture of H. S. Vandiver. This states that the group $\mathcal{A}^+$ of
the previous paragraph is always trivial. It is not too hard to show this is
equivalent to the assertion that the class number of
$\mathbb{Q}(\zeta_l + \zeta_l^{-1}) = \mathbb{Q}(\cos(2\pi/l))$ is not divisible
by $l$. Vandiver made this conjecture around 1920. See his article on Fermat's
Last Theorem [231]. If true the conjecture has many important consequences.
S. Wagstaff has shown Vandiver's conjecture is true for all primes less than
125,000. This seems to be impressive evidence, but Larry Washington has shown
on probabilistic grounds that 125,000 is too small for the evidence to be
convincing.

We conclude this chapter by giving proofs of Propositions 15.3.1 and
15.3.2.

We begin with Proposition 15.3.1. Let $K$ be an algebraic number field
and $D$ its ring of integers. Let $M \subset D$ be a fixed ideal. For any ideal
$A$ in $D$ let $\bar{A}$ denote its ideal class. Given $A$ we will construct an
ideal $C$ such that $(C, M) = 1$ and $\bar{A}^{-1} = \bar{C}$. This shows the
inverse of any class contains an ideal prime to $M$. Thus every class contains
an ideal prime to $M$. To construct $C$ we proceed as follows. Let
$\{P_1, P_2, \ldots, P_t\}$ be the set of primes dividing $M$ which do not
divide $A$. This set may be empty. If $P \mid A$ let, as usual,
$a(P) = \operatorname{ord}_P A$ denote the exponent of $P$ in the prime
decomposition of $A$. Choose

$$\pi(P) \in P^{a(P)} - P^{a(P)+1}.$$

By the Chinese Remainder Theorem we can find an $\alpha \in D$ such that

$$\alpha \equiv \pi(P)\ (P^{a(P)+1}) \quad \text{for } P \mid A,$$

$$\alpha \equiv 1\ (P_i) \quad \text{for } i = 1, 2, \ldots, t.$$

One checks easily that $(\alpha) = AC$ with $(C, M) = 1$. Thus
$\bar{A}^{-1} = \bar{C}$ and the proof is complete. □

Finally, we turn to the proof of Proposition 15.3.2. We will need the
following lemma, which is proven in the same way as the special case $m = l$
done during the proof of Theorem 7.

Lemma 2. Let $G$ denote the Galois group of $\mathbb{Q}(\zeta_m)/\mathbb{Q}$. The
element $r_b = (\sigma_b - b)\theta \in \mathbb{Z}[G]$. In fact,
$r_b = -\sum [bt/m]\, \sigma_t^{-1}$ where the sum is over $1 \le t < m$ with
$(t, m) = 1$.

Let $P$ be a prime ideal in $D_m$, the ring of integers in
$\mathbb{Q}(\zeta_m)$. Assume $m \notin P$ and let $P \cap \mathbb{Z} = (p)$. As in
Section 3 of Chapter 14 we associate a Gauss sum $g(P)$ to $P$. We know
$g(P) \in \mathbb{Q}(\zeta_m, \zeta_p) = \mathbb{Q}(\zeta_{pm})$.

Lemma 3. Let $b$ be an integer prime to $m$. Determine $b'$ by the conditions
$b' \equiv b\ (m)$ and $b' \equiv 1\ (p)$. Let $\sigma_{b'}$ be the corresponding
automorphism of $\mathbb{Q}(\zeta_{pm})$. Then

$$g(P)^{\sigma_{b'} - b} \in \mathbb{Q}(\zeta_m).$$

Proof. The automorphisms of $\mathbb{Q}(\zeta_{pm})$ which leave
$\mathbb{Q}(\zeta_m)$ fixed are of the form $\sigma_c$ where $(c, pm) = 1$ and
$c \equiv 1\ (m)$. Let

$$\Omega_b(P) = g(P)^{\sigma_{b'} - b}.$$

We will show $\Omega_b(P)^{\sigma_c} = \Omega_b(P)$. This proves, by Galois
theory, that $\Omega_b(P) \in \mathbb{Q}(\zeta_m)$.

Recall that $g(P) = \sum \chi_P(t)\psi(t)$ where the sum is over a reduced
residue system modulo m. Since $\chi_P(t) \in \mathbb{Q}(\zeta_m)$ and
$\psi(t) \in \mathbb{Q}(\zeta_p)$ we have

$$g(P)^{\sigma_{b'}} = \sum \chi_P(t)^b \psi(t)$$

and

$$g(P)^{\sigma_{b'}\sigma_c} = \sum \chi_P(t)^b \psi(t)^c
= \sum \chi_P(t)^b \psi(ct).$$

Thus

$$g(P)^{\sigma_{b'}\sigma_c} = \chi_P(c)^{-b}\, g(P)^{\sigma_{b'}}. \tag{1}$$

Similarly, we find

$$g(P)^{\sigma_c} = \chi_P(c)^{-1}\, g(P). \tag{2}$$

Raising both sides of (2) to the $b$th power and dividing the result into
Equation (1) gives $\Omega_b(P)^{\sigma_c} = \Omega_b(P)$ as asserted. □

We are now in a position to complete the proof of Proposition 15.3.2.
Let $P \subset D_m$ be a prime ideal not containing $m$. Stickelberger's relation
asserts that $g(P)^m \in \mathbb{Q}(\zeta_m)$ and $(g(P)^m) = P^{m\theta}$.
Applying $\sigma_{b'} - b$ to both sides shows that
$(\Omega_b(P)^m) = P^{m r_b}$. By Lemmas 2 and 3 above, this becomes, in $D_m$,
the equation $(\Omega_b(P))^m = (P^{r_b})^m$. It follows from unique
factorization for ideals that $P^{r_b} = (\Omega_b(P))$. Thus $P^{r_b}$ is a
principal ideal and therefore $A^{r_b}$ is principal for any ideal $A$
relatively prime to $(m)$. By Proposition 15.3.1 we conclude that $r_b$
annihilates the class group of $D_m$. This completes the proof. □


Notes 

In 1960 Vandiver published a survey article in which he remarks that some
1500 papers on Bernoulli numbers had been published [232]. Clearly,
this sequence of numbers has considerable fascination and importance.
The most extensive treatise that has appeared on Bernoulli numbers is the 
classic by N. Nielsen [199]. A more accessible modern source is the first 
two chapters of the book on analytic number theory by H. Rademacher 
[204]. This book has an exposition of the Euler-MacLaurin summation 
formula, an important application of the Bernoulli numbers which we have 
not considered. 

The evaluation of $\zeta(s)$ at the positive even integers by Euler was a major
accomplishment. It is surprising that almost nothing is known about the
values of $\zeta(s)$ at positive odd integers. In 1978 the French mathematician
R. Apéry created a sensation by finding an extraordinarily ingenious proof
that $\zeta(3)$ is irrational. See the entertaining article by A. van der
Poorten [233].

The relation of Bernoulli numbers to Fermat’s Last Theorem and the 
arithmetic of cyclotomic number fields is very close as is evident from the 
numerous references to them in the scholarly book by P. Ribenboim [206]; 
see, in particular, Section 2 of Lecture VI. The short expository article by 
Vandiver [231] is also worth consulting. 

The paper [159] by Johnson has a very readable discussion of regular and
irregular primes and mentions a number of interesting open problems.
We follow his brief history of the calculation of irregular primes. Kummer
himself determined that 8 of the first 37 primes were irregular. In the 1930s
Vandiver and others extended the calculation to all primes less than 618.
In 1955 Vandiver, D. H. Lehmer, Emma Lehmer, Selfridge, and Nicole
worked up to 4001. In 1964 Selfridge and Pollack announced computations
up to 25,000. These were not published. In 1970 Kobelev published tables up
to 5500 and in 1973 Johnson attained 8000. In 1975 Johnson made it up
to 30,000. As stated earlier, the current record is due to Wagstaff, 125,000.
The art of computing has come a long way!

The following result was discovered independently by T. Metsänkylä
[188] and H. Yokoi [247]. Let $m > 2$ be an integer and $H$ a proper subgroup
of $U(\mathbb{Z}/m\mathbb{Z})$. There exist infinitely many irregular primes $p$
such that the congruence class of $p$ modulo $m$ is not in $H$. By contrast,
there is not a single modulus $m > 2$ known for which there exist infinitely
many irregular primes $p \equiv 1\ (m)$.

The main theorem of Section 3 was published by Herbrand in 1932
[149]. A proof relying on $p$-adic numbers and congruences for generalized
Bernoulli numbers can be found in Ribet's paper [208]. See also Chapter 1
of Lang's book [167]. There are a number of important conjectures which
concern the $p$-primary component of the class group of $\mathbb{Q}(\zeta_p)$.
The introduction to the paper of A. Wiles [242] describes a conjecture which
makes Herbrand's theorem more precise.

Exercises 

1. Using the definition of the Bernoulli numbers show $B_{10} = 5/66$ and
$B_{12} = -691/2730$.

2. If $a \in \mathbb{Z}$, show $a(a^m - 1)B_m \in \mathbb{Z}$ for all $m > 0$.

3. If $a \in \mathbb{Z}$, show $a^m(a^m - 1)B_m/m \in \mathbb{Z}$ for all $m > 0$.

4. If $m \ge 4$ is even, show $2B_m \equiv 1\ (4)$.

5. If $p$ is an odd prime and $p - 1 \mid m$ show $(B_m + p^{-1} - 1)/m$ is
$p$-integral. This result is due to L. Carlitz.*

6. For $m \ge 3$, show $|B_{2m+2}| > |B_{2m}|$. (Hint: Use Theorem 2.)

7. Let $m \ge 2$ be an even integer. Show there exist infinitely many $n > m$
such that $B_n - B_m \in \mathbb{Z}$. [Hint: Let $q$ be a prime such that
$q \equiv 1\ ((m + 1)!)$ and try $n = qm$. The existence of infinitely many such
primes $q$ is shown in Chapter 16. This result is due to R. Rado.]

8. Consider the power series expansion of $\tan(x)$ about the origin:

$$\tan(x) = \sum_{k=1}^{\infty} T_k \frac{x^{2k-1}}{(2k-1)!}.$$

Show $T_k = (-1)^{k-1}(B_{2k}/2k)(2^{2k} - 1)2^{2k}$. Note that
$T_k \in \mathbb{Z}$ for all $k$ by Exercise 3.

9. Using Lemma 1 in Section 1 show the radius of convergence of
$\sum B_n (t^n/n!)$ is $2\pi$. As a consequence show that for any $C, k > 0$
there are infinitely many $n$ such that $|B_n| > Cn^k$. (This result is weaker
than the estimate given by Equation (8) of Section 1. On the other hand, it is
much easier to obtain.)

* L. Carlitz. Some congruences for the Bernoulli numbers. Amer. J. Math., 75 (1953), 163-172. 
10. Use the Voronoi congruences to obtain the following result of Kummer:

$$\sum_{k=0}^{r} (-1)^k \binom{r}{k}
\frac{B_{2n+k(p-1)}}{2n + k(p-1)} \equiv 0\ (p^r),$$

provided $2 \le r + 1 \le 2n$ and $p - 1 \nmid 2n$. This is a bit tricky. With
minor changes in notation the proof is contained in Section 8 of Chapter IX of
the book by Uspensky and Heaslet [230].


11. Those familiar with the approach of B. Mazur to $p$-adic zeta and
$L$-functions can try the following. Let $\mu_a$ be the normalized "Mazur
measure" on $\mathbb{Z}_p$. Use the Voronoi congruences to prove
$\int_{\mathbb{Z}_p^*} x^{k-1}\, d\mu_a = (a^{-k} - 1)(1 - p^{k-1})(-B_k/k)$. For
the notation and the definition of the Mazur measure the reader can consult
Koblitz [162].


12. Recall the definition of the Bernoulli polynomials:

$$B_m(x) = \sum_{k=0}^{m} \binom{m}{k} B_k x^{m-k}.$$

Show that $te^{tx}/(e^t - 1) = \sum_{m=0}^{\infty} B_m(x)(t^m/m!)$.

13. Show $B_m(x + 1) - B_m(x) = mx^{m-1}$.


14. Use Exercise 13 to give a new proof of Theorem 1.

15. Suppose $f(x) = \sum_{k=0}^{n} a_k x^k$ is a polynomial with complex
coefficients. Use Exercise 13 to find a polynomial $F(x)$ such that
$F(x + 1) - F(x) = f(x)$.

16. For $n \ge 1$ show $(d/dx)B_n(x) = nB_{n-1}(x)$.

17. Show $B_n(1 - x) = (-1)^n B_n(x)$.

18. Use Exercises 13 and 17 to give a new proof that $B_n = 0$ for $n$ odd and
$n > 1$.

19. Suppose $n$ and $F$ are integers and $n, F > 0$. Show that

$$B_n(Fx) = F^{n-1} \sum_{a=0}^{F-1} B_n\!\left(x + \frac{a}{F}\right).$$

(Hint: Use Exercise 12.)

20. Suppose $H(x)$ is a polynomial of degree $n$ with complex coefficients.
Suppose that for all integers $F > 0$ we have
$H(Fx) = F^{n-1} \sum_{a=0}^{F-1} H(x + (a/F))$. Show that $H(x) = CB_n(x)$ for
some constant $C$. (Hint: Use Exercise 16 and induction on $n$.)

21. Show $2^{n-1} B_n(\frac{1}{2}) = (1 - 2^{n-1})B_n$.

22. More generally, show that
$(1 - F^{n-1})B_n = F^{n-1} \sum_{a=1}^{F-1} B_n(a/F)$.

23. Prove the assertions: $\mathcal{A} = \mathcal{A}^+ \mathcal{A}^-$ and
$\mathcal{A}^+ \cap \mathcal{A}^- = (e)$.



Chapter 16 

Dirichlet L-functions 


The theory of analytic functions has many applications in
number theory. A particularly spectacular application was
discovered by Dirichlet who proved in 1837 that there are
infinitely many primes in any arithmetic progression
$b, b + m, b + 2m, \ldots$, where $(m, b) = 1$. To do this he
introduced the L-functions which bear his name. In this
chapter we will define these functions, investigate their
properties, and prove the theorem on arithmetic progressions.
The use of Dirichlet L-functions extends
beyond the proof of this theorem. It turns out that their
values at negative integers are especially important. We
will derive these values and show how they relate to
Bernoulli numbers.

For the most part we will use only basic calculus. However,
in Section 6 where we discuss the value of the L-functions
at 1 we use complex function theory in an
essential way. This can be avoided but to do so involves
sacrificing both depth and elegance. All the necessary
background can be found in any standard treatise. The
book of L. Ahlfors [85] is a convenient reference. In
Sections 1-4 the letter $s$ will stand for a real variable,
$s > 1$.


§1 The Zeta Function 


The Riemann zeta function $\zeta(s)$ is defined by
$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$. It converges for $s > 1$ and converges
uniformly for $s \ge 1 + \delta > 1$, for each $\delta > 0$.


Proposition 16.1.1. For $s > 1$

$$\zeta(s) = \prod_p (1 - p^{-s})^{-1},$$

where the product is over all primes $p > 0$.

Proof. For $s > 1$, $p^{-s} < 1$, so we have
$(1 - p^{-s})^{-1} = \sum_{m=0}^{\infty} p^{-ms}$. By the theorem of unique
factorization

$$\prod_{p \le N} (1 - p^{-s})^{-1} = \sum_{n \le N} n^{-s} + R_N(s).$$

Clearly, $R_N(s) \le \sum_{n=N+1}^{\infty} n^{-s}$. Since $\zeta(s)$ converges,
$R_N(s) \to 0$ as $N \to \infty$. The result follows. □


The behavior of $\zeta(s)$ as $s \to 1$ is very important. Since
$\sum_{n=1}^{\infty} n^{-1}$ diverges we, of course, suspect
$\zeta(s) \to \infty$ as $s \to 1$. In fact,

Proposition 16.1.2. Assume $s > 1$. Then

$$\lim_{s \to 1} (s - 1)\zeta(s) = 1.$$

Proof. For fixed $s$, $t^{-s}$ is a monotone decreasing function of $t$. Thus,

$$(n + 1)^{-s} < \int_n^{n+1} t^{-s}\, dt < n^{-s}.$$

Summing from $n = 1$ to $\infty$,

$$\zeta(s) - 1 < \int_1^{\infty} t^{-s}\, dt < \zeta(s).$$

The value of the integral is $(s - 1)^{-1}$. It follows that
$1 < (s - 1)\zeta(s) < s$. Taking the limit as $s \to 1$ gives the result. □
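
The inequality $1 < (s - 1)\zeta(s) < s$ obtained in the proof is easy to
observe numerically. The sketch below is only an approximation scheme of our
own choosing: it sums the first $N - 1$ terms of the series and estimates the
tail by the integral $N^{1-s}/(s - 1)$ plus the half-term $N^{-s}/2$. The
function name and the default $N$ are assumptions, not part of the text.

```python
def zeta(s, N=1000):
    """Approximate zeta(s) for real s > 1.

    The tail sum over n >= N is replaced by the integral of t^(-s) from N to
    infinity plus half of the N-th term; the neglected error is of size
    roughly s * N^(-s-1) / 12.
    """
    head = sum(n ** -s for n in range(1, N))
    return head + N ** (1 - s) / (s - 1) + 0.5 * N ** -s


def bracket(s):
    """Return the triple (1, (s - 1) * zeta(s), s) for inspection."""
    return 1.0, (s - 1) * zeta(s), s
```

As $s$ decreases toward 1 the middle entry of bracket(s) is squeezed between
1 and $s$, in agreement with the proposition.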


Corollary. As $s \to 1$ we have

$$\frac{\ln \zeta(s)}{\ln (s - 1)^{-1}} \to 1.$$

Proof. Let $(s - 1)\zeta(s) = \rho(s)$. Then
$\ln(s - 1) + \ln \zeta(s) = \ln \rho(s)$, so we have

$$\frac{\ln \zeta(s)}{\ln (s - 1)^{-1}} = 1 + \frac{\ln \rho(s)}{\ln (s - 1)^{-1}}.$$

As $s \to 1$, $\rho(s) \to 1$ by the proposition. Therefore,
$\ln \rho(s) \to 0$ and the result follows. □


Proposition 16.1.3. $\ln \zeta(s) = \sum_p p^{-s} + R(s)$ where $R(s)$ remains
bounded as $s \to 1$.

Proof. We use the formula $-\ln(1 - x) = x + x^2/2 + x^3/3 + \cdots$ which is
valid for $-1 \le x < 1$.

By Proposition 16.1.1 we have

$$\zeta(s) = \prod_{p \le N} (1 - p^{-s})^{-1} \lambda_N(s),$$

where $\lambda_N(s) \to 1$ as $N \to \infty$. Taking the logarithm of both sides
yields

$$\ln \zeta(s) = \sum_{p \le N} \sum_{m=1}^{\infty} m^{-1} p^{-ms} + \ln \lambda_N(s).$$

Taking the limit as $N \to \infty$,

$$\ln \zeta(s) = \sum_p \sum_{m=1}^{\infty} m^{-1} p^{-ms}
= \sum_p p^{-s} + \sum_p \sum_{m=2}^{\infty} m^{-1} p^{-ms}.$$

The second sum is less than

$$\sum_p \sum_{m=2}^{\infty} p^{-ms} = \sum_p p^{-2s}(1 - p^{-s})^{-1}
\le (1 - 2^{-s})^{-1} \sum_p p^{-2s} \le 2\zeta(2).$$

Throughout we have used the assumption that $s > 1$.


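Definition. A set of positive primes $\mathcal{P}$ is said to have Dirichlet
density if

$$\lim_{s \to 1} \frac{\sum_{p \in \mathcal{P}} p^{-s}}{\ln (s - 1)^{-1}}$$

exists. If the limit exists we set it equal to $d(\mathcal{P})$ and call
$d(\mathcal{P})$ the Dirichlet density of $\mathcal{P}$.


Proposition 16.1.4. Let $\mathcal{P}$ be a set of positive prime numbers. Then

(a) If $\mathcal{P}$ is finite, then $d(\mathcal{P}) = 0$.

(b) If $\mathcal{P}$ consists of all but finitely many positive primes, then
$d(\mathcal{P}) = 1$.

(c) If $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{P}_2$ where $\mathcal{P}_1$ and
$\mathcal{P}_2$ are disjoint and $d(\mathcal{P}_1)$ and $d(\mathcal{P}_2)$ both
exist, then $d(\mathcal{P}) = d(\mathcal{P}_1) + d(\mathcal{P}_2)$.

Proof. Parts (a) and (c) are clear from the definition of Dirichlet density.
Part (b) follows quickly from the corollary to Proposition 16.1.2 and
Proposition 16.1.3. □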

We are now in a position to state the main theorem of this chapter. The 
proof will be spread out over the next three sections. 

Theorem 1 (L. Dirichlet). Suppose $a, m \in \mathbb{Z}$, with $(a, m) = 1$. Let
$\mathcal{P}(a; m)$ be the set of positive primes $p$ such that
$p \equiv a\ (m)$. Then $d(\mathcal{P}(a; m)) = 1/\phi(m)$.

Note that Theorem 1 certainly implies $\mathcal{P}(a; m)$ is infinite, since if
it were finite its density would be zero.


§2 A Special Case 

We will first prove Theorem 1 in the case where m = 4. The basic ideas of the 
proof are all present in this special case but the details are more transparent. 

Define a function $\chi$ from $\mathbb{Z}$ to $\{0, \pm 1\}$ as follows: $\chi(n) = 0$ if $n$ is even,
$\chi(n) = 1$ if $n \equiv 1 \ (4)$, and $\chi(n) = -1$ if $n \equiv 3 \ (4)$. It is easily seen that
$\chi(mn) = \chi(m)\chi(n)$ for all $m, n \in \mathbb{Z}$.

Define $L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s} = 1 - 3^{-s} + 5^{-s} - 7^{-s} + \cdots$. For all $n$
we have $|\chi(n) n^{-s}| \le n^{-s}$. It follows that the terms of $L(s, \chi)$ are dominated in
absolute value by the terms of $\zeta(s)$. Thus $L(s, \chi)$ converges and is continuous
for $s > 1$. Since $\chi$ is completely multiplicative the proof of Proposition 16.1.1
shows that

$$L(s, \chi) = \prod_{p} (1 - \chi(p) p^{-s})^{-1}.$$

It is useful to modify $\zeta(s)$ so as to suppress the even terms. Define
$\zeta^*(s) = \sum_{n \text{ odd}} n^{-s}$. Since

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \sum_{n \text{ odd}} n^{-s} + \sum_{n \text{ even}} n^{-s} = \zeta^*(s) + 2^{-s} \zeta(s)$$

we have $\zeta^*(s) = (1 - 2^{-s})\zeta(s)$ and so

$$\zeta^*(s) = \prod_{p \text{ odd}} (1 - p^{-s})^{-1}.$$

Using the method of proof of Proposition 16.1.3 we find

$$\ln L(s, \chi) = \sum_{p \text{ odd}} \chi(p) p^{-s} + R_1(s), \qquad \text{(i)}$$

$$\ln \zeta^*(s) = \sum_{p \text{ odd}} p^{-s} + R_2(s), \qquad \text{(ii)}$$

where $R_1(s)$ and $R_2(s)$ remain bounded as $s \to 1$.

We have $1 + \chi(p) = 2$ if $p \equiv 1 \ (4)$ and $1 + \chi(p) = 0$ if $p \equiv 3 \ (4)$. Similarly,
$1 - \chi(p) = 2$ if $p \equiv 3 \ (4)$ and $1 - \chi(p) = 0$ if $p \equiv 1 \ (4)$. From (i) and (ii) we
deduce

$$\ln \zeta^*(s) + \ln L(s, \chi) = 2 \sum_{p \equiv 1 \, (4)} p^{-s} + R_3(s), \qquad \text{(iii)}$$

$$\ln \zeta^*(s) - \ln L(s, \chi) = 2 \sum_{p \equiv 3 \, (4)} p^{-s} + R_4(s), \qquad \text{(iv)}$$

where $R_3(s)$ and $R_4(s)$ remain bounded as $s \to 1$.

The next step is to show that $\ln L(s, \chi)$ remains bounded as $s \to 1$. To see
this write

$$L(s, \chi) = 1 - 3^{-s} + 5^{-s} - \cdots = (1 - 3^{-s}) + (5^{-s} - 7^{-s}) + \cdots = 1 - (3^{-s} - 5^{-s}) - (7^{-s} - 9^{-s}) - \cdots.$$

It follows that for all $s > 1$ we have $\tfrac{2}{3} < L(s, \chi) < 1$. Thus, for $s > 1$ we have
$\ln \tfrac{2}{3} < \ln L(s, \chi) < \ln 1 = 0$.

As a final preparatory step we note that $\ln \zeta^*(s) = \ln(1 - 2^{-s}) + \ln \zeta(s)$ so
by the corollary to Proposition 16.1.2 we have $\ln \zeta^*(s) / \ln (s-1)^{-1} \to 1$ as
$s \to 1$.

Now divide each term of Equations (iii) and (iv) by $\ln (s-1)^{-1}$ and take
the limit as $s \to 1$. The result is

Proposition 16.2.1. $d(\mathcal{P}(1; 4)) = \tfrac{1}{2}$ and $d(\mathcal{P}(3; 4)) = \tfrac{1}{2}$.


To prove Theorem 1 in the general case we need to generalize $\chi$ and
$L(s, \chi)$. This leads to the introduction of Dirichlet characters and Dirichlet
L-functions.
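
The bounds on $L(s, \chi)$ used in this section can be checked numerically. The sketch below is only an illustration: the names `chi4` and `L_chi4` and the truncation length are assumptions of the example, and the sum is a finite partial sum rather than the full series.

```python
def chi4(n: int) -> int:
    """The character of this section: 0 for even n, +1 for n = 1 (4), -1 for n = 3 (4)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def L_chi4(s: float, terms: int = 200_000) -> float:
    """Partial sum of L(s, chi) = 1 - 3^(-s) + 5^(-s) - ... over n <= terms."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))


if __name__ == "__main__":
    for s in (1.01, 1.5, 2.0, 3.0):
        value = L_chi4(s)
        # the grouping argument in the text gives 2/3 < L(s, chi) < 1 for s > 1
        print(s, value, 2 / 3 < value < 1)
```

Because the series is alternating with decreasing terms, the truncation error is at most the size of the first omitted term, so a modest number of terms already suffices for $s$ well above 1.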


§3 Dirichlet Characters 


The function $\chi$ considered in the last section can be obtained from the following
construction. Consider the group $U(\mathbb{Z}/4\mathbb{Z})$. This group has two elements
$1 + 4\mathbb{Z}$ and $3 + 4\mathbb{Z}$. Define $\chi' : U(\mathbb{Z}/4\mathbb{Z}) \to \{\pm 1\}$ by $\chi'(1 + 4\mathbb{Z}) = 1$ and
$\chi'(3 + 4\mathbb{Z}) = -1$. Then $\chi'$ is a homomorphism from $U(\mathbb{Z}/4\mathbb{Z})$ to $\mathbb{C}^*$. For
$n \in \mathbb{Z}$ define $\chi(n) = 0$ if $(n, 4) > 1$ and $\chi(n) = \chi'(n + 4\mathbb{Z})$ if $(n, 4) = 1$. This
function $\mathbb{Z} \to \mathbb{C}$ coincides with the function $\chi$ of the last section.

This construction is easy to generalize. Let $m$ be a fixed positive integer.
Let $\chi' : U(\mathbb{Z}/m\mathbb{Z}) \to \mathbb{C}^*$ be a homomorphism. Given $\chi'$ define $\chi : \mathbb{Z} \to \mathbb{C}$ as
follows: if $(n, m) > 1$ set $\chi(n) = 0$; if $(n, m) = 1$ set $\chi(n) = \chi'(n + m\mathbb{Z})$. The
functions $\chi$ defined in this manner are called Dirichlet characters modulo $m$.
Another characterization is given by the following three conditions on a
function $\chi : \mathbb{Z} \to \mathbb{C}$.

(a) $\chi(n + m) = \chi(n)$ for all $n \in \mathbb{Z}$.

(b) $\chi(kn) = \chi(k)\chi(n)$ for all $k, n \in \mathbb{Z}$.

(c) $\chi(n) \ne 0$ if and only if $(n, m) = 1$.

It is an easy exercise to see that these three conditions specify the set of
Dirichlet characters modulo $m$.

To investigate the properties of Dirichlet characters we begin by studying 
a more general problem. 

Let $A$ be a finite abelian group (written multiplicatively). A character on $A$
is a homomorphism from $A$ to $\mathbb{C}^*$. The set of such characters will be denoted
by $\hat{A}$. If $\chi, \psi \in \hat{A}$ define $\chi\psi$ to be the function which takes $a \in A$ to $\chi(a)\psi(a)$.
Then $\chi\psi$ is also a character. We show that this product makes $\hat{A}$ into a group.
Define $\chi_0$, the trivial character, by $\chi_0(a) = 1$ for all $a \in A$. If $\chi \in \hat{A}$ define $\chi^{-1}$
by $\chi^{-1}(a) = \chi(a)^{-1}$ for all $a \in A$. It is easily seen that $\chi^{-1} \in \hat{A}$ and $\chi\chi^{-1} = \chi_0$.
With these definitions $\hat{A}$ becomes an abelian group with $\chi_0$ as the identity
element. We omit the more or less obvious details.

Let $n$ be the order of $A$. If $a \in A$, then $a^n = e$, the identity element of $A$. So, if
$\chi \in \hat{A}$ we have $\chi(a)^n = 1$, i.e., the values of $\chi$ are $n$th roots of unity. It follows
that $\overline{\chi(a)} = \chi(a)^{-1} = \chi^{-1}(a)$, where bar denotes complex conjugation. Thus
$\chi^{-1}$ is sometimes written $\bar{\chi}$ and called the conjugate character of $\chi$.

Two questions present themselves immediately. How big is $\hat{A}$? What is its
structure? The questions are easy to answer when $A$ is cyclic. In the general
case we will use a theorem from group theory which asserts that a finite
abelian group is a direct product of cyclic groups (see I. Herstein [150]).
When $A = U(\mathbb{Z}/m\mathbb{Z})$, the case of interest to us, this result follows from Theorem
3 of Chapter 4.

Suppose that $A$ is cyclic and generated by an element $g$ of order $n$. Let
$\zeta_n = e^{2\pi i/n}$. If $\chi \in \hat{A}$ we have $\chi(g) = \zeta_n^{e}$ for some uniquely determined integer $e$
such that $0 \le e < n$. Since $\chi(g^m) = \chi(g)^m$, $\chi$ is determined by its value at $g$.
Conversely, if $0 \le e < n$ define $\chi(g^m) = \zeta_n^{em}$. It is easy to see $\chi$ is well defined
and is a character. Thus there are exactly $n$ characters on $A$. Let $\chi_1 \in \hat{A}$ be such
that $\chi_1(g) = \zeta_n$. If $\chi \in \hat{A}$ and $\chi(g) = \zeta_n^{e}$, then $\chi(g) = \chi_1(g)^e$ which implies
$\chi = \chi_1^{e}$. This shows that $\hat{A}$ is cyclic and generated by $\chi_1$. Thus $\hat{A} \approx A$.
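
The cyclic case can be made concrete for $A = U(\mathbb{Z}/p\mathbb{Z})$ with $p$ prime, which is cyclic. The sketch below is a minimal illustration, assuming a brute-force search for a generator; the names `primitive_root` and `cyclic_characters` are hypothetical helpers, and characters are stored as dictionaries of floating-point complex values.

```python
import cmath


def primitive_root(p):
    """Smallest generator g of U(Z/pZ) for a prime p, found by brute force."""
    for g in range(1, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no generator found; is p prime?")


def cyclic_characters(p):
    """The n = p - 1 characters of U(Z/pZ): the e-th one sends g^k to zeta_n^(e*k)."""
    n = p - 1
    g = primitive_root(p)
    zeta = cmath.exp(2j * cmath.pi / n)
    exponent = {pow(g, k, p): k for k in range(n)}  # a = g^k  ->  k
    return [{a: zeta ** (e * k) for a, k in exponent.items()} for e in range(n)]


if __name__ == "__main__":
    chars = cyclic_characters(7)
    chi1 = chars[1]  # plays the role of the generator chi_1 of the character group
    for a in sorted(chi1):
        print(a, chi1[a])
```

Here `chars[e]` corresponds to $\chi_1^{e}$, so the list itself exhibits the character group as cyclic of order $n$.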

In general $A$ is a direct product of cyclic groups. This means that there are
elements $g_1, g_2, \ldots, g_t \in A$ such that

(i) The order of $g_i$ is $n_i$.

(ii) Every element $a \in A$ can be uniquely written in the form $a = g_1^{m_1} g_2^{m_2} \cdots g_t^{m_t}$,
where $0 \le m_i < n_i$ for all $i$.

If the order of $A$ is $n$, then clearly $n = n_1 n_2 \cdots n_t$.

Suppose $\chi \in \hat{A}$. Then $\chi$ is determined by the values $\chi(g_i) = \zeta_{n_i}^{e_i}$, where
$0 \le e_i < n_i$. Conversely, given a $t$-tuple $(e_1, e_2, \ldots, e_t)$ with $0 \le e_i < n_i$ for all $i$ we
can define a character $\chi$ as follows. For $a \in A$ write $a = g_1^{m_1} g_2^{m_2} \cdots g_t^{m_t}$ as in (ii)
and set $\chi(a) = \zeta_{n_1}^{e_1 m_1} \zeta_{n_2}^{e_2 m_2} \cdots \zeta_{n_t}^{e_t m_t}$. It can easily be checked that $\chi$ is a character.

There are thus $n_1 n_2 \cdots n_t = n$ characters on $A$. Moreover, let $\chi_i$ be specified by
the conditions $\chi_i(g_i) = \zeta_{n_i}$ and $\chi_i(g_j) = 1$ for $j \ne i$. Then $\chi_i$ has order $n_i$ and $\hat{A}$ is
the direct product of the cyclic subgroups generated by the $\chi_i$. This shows
$\hat{A} \approx A$.

The next two results will be of importance in the next section. 

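Proposition 16.3.1. Let $A$ be a finite abelian group. If $\chi, \psi \in \hat{A}$ and $a, b \in A$, then

(i) $\sum_{a \in A} \chi(a) \overline{\psi(a)} = n\,\delta(\chi, \psi)$, where $\delta(\chi, \chi) = 1$ and $\delta(\chi, \psi) = 0$ if $\chi \ne \psi$.

(ii) $\sum_{\chi \in \hat{A}} \chi(a) \overline{\chi(b)} = n\,\delta(a, b)$, where $\delta(a, a) = 1$ and $\delta(a, b) = 0$ if $a \ne b$.

Proof. Since $\sum_{a \in A} \chi(a) \overline{\psi(a)} = \sum_{a \in A} \chi\psi^{-1}(a)$, to prove (i) it is enough to show
that $\sum_{a \in A} \chi(a) = n$ if $\chi = \chi_0$ and $\sum_{a \in A} \chi(a) = 0$ if $\chi \ne \chi_0$. The
first assertion is clear by definition. Assume $\chi \ne \chi_0$. Then there is a $b \in A$ such
that $\chi(b) \ne 1$. We have $\sum_a \chi(a) = \sum_a \chi(ba) = \chi(b) \sum_a \chi(a)$ and so
$(\chi(b) - 1) \sum_a \chi(a) = 0$. Since $\chi(b) - 1 \ne 0$ this implies $\sum_a \chi(a) = 0$.

To prove (ii) we first note that $\sum_\chi \chi(a) \overline{\chi(b)} = \sum_\chi \chi(ab^{-1})$. It suffices to
show $\sum_\chi \chi(a) = n$ if $a = e$ and $\sum_\chi \chi(a) = 0$ if $a \ne e$. The first assertion is
clear. Assume $a \ne e$. We claim there is a character $\psi$ such that $\psi(a) \ne 1$.
To see this write $a = g_1^{m_1} g_2^{m_2} \cdots g_t^{m_t}$ with $0 \le m_i < n_i$ for all $i$. Since $a \ne e$ at
least one $m_i \ne 0$. Then $\chi_i(a) = \chi_i(g_i)^{m_i} = \zeta_{n_i}^{m_i} \ne 1$. Take $\psi = \chi_i$. Then,
$\sum_\chi \chi(a) = \sum_\chi \psi\chi(a) = \psi(a) \sum_\chi \chi(a)$ and so $(\psi(a) - 1) \sum_\chi \chi(a) = 0$. Since
$\psi(a) - 1 \ne 0$ we have $\sum_\chi \chi(a) = 0$. □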


The relations given by (i) and (ii) are called the orthogonality relations. 
We now interpret these for Dirichlet characters modulo m. Here we take 
A = U(Z/mZ). Dirichlet characters are defined on Z but induce and are 
induced by elements in the character group of U(Z/mZ). Hence there are 
exactly $\phi(m)$ Dirichlet characters modulo $m$. From the definition and the last
proposition we deduce

Proposition 16.3.2. Let $\chi$ and $\psi$ be Dirichlet characters modulo $m$, and $a, b \in \mathbb{Z}$.
Then

(i) $\sum_{a=0}^{m-1} \chi(a) \overline{\psi(a)} = \phi(m)\,\delta(\chi, \psi)$,

(ii) $\sum_{\chi} \chi(a) \overline{\chi(b)} = \phi(m)\,\delta(a, b)$.

In part (ii) the sum is over all Dirichlet characters modulo $m$, and $\delta(a, b) = 1$
if $a \equiv b \ (m)$ and $\delta(a, b) = 0$ if $a \not\equiv b \ (m)$.
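
The orthogonality relations can be verified numerically for small moduli. The sketch below builds the characters of $U(\mathbb{Z}/m\mathbb{Z})$ by extending them one subgroup at a time, which is a different route from the cyclic decomposition used in the text (it is chosen here only because it needs no explicit generators). The names `unit_characters` and `check_orthogonality` are hypothetical, values are floating-point complex numbers, and relation (ii) is tested only for $a, b$ prime to $m$.

```python
import cmath
from math import gcd


def unit_characters(m):
    """All homomorphisms U(Z/mZ) -> C*, each as a dict {unit: value}, for m >= 2."""
    units = [a for a in range(1, m) if gcd(a, m) == 1]
    subgroup = [1]
    chars = [{1: 1 + 0j}]
    for g in units:
        if g in subgroup:
            continue
        # k = least positive exponent with g^k in the current subgroup H
        k, g_to_k = 1, g
        while g_to_k not in subgroup:
            g_to_k = g_to_k * g % m
            k += 1
        extended = []
        for chi in chars:
            theta = cmath.phase(chi[g_to_k])
            for j in range(k):
                # w runs over the k solutions of w^k = chi(g^k)
                w = cmath.exp(1j * (theta + 2 * cmath.pi * j) / k)
                ext = {}
                for h in subgroup:
                    g_to_t = 1
                    for t in range(k):
                        ext[h * g_to_t % m] = chi[h] * w ** t  # chi'(h g^t) = chi(h) w^t
                        g_to_t = g_to_t * g % m
                extended.append(ext)
        chars = extended
        subgroup = sorted(chars[0])
    return chars


def check_orthogonality(m, tol=1e-9):
    """True if both relations of Proposition 16.3.2 hold numerically (units only)."""
    chars = unit_characters(m)
    units = sorted(chars[0])
    phi = len(units)
    for i, chi in enumerate(chars):
        for j, psi in enumerate(chars):
            total = sum(chi[a] * psi[a].conjugate() for a in units)
            if abs(total - (phi if i == j else 0)) > tol:
                return False
    for a in units:
        for b in units:
            total = sum(chi[a] * chi[b].conjugate() for chi in chars)
            if abs(total - (phi if a == b else 0)) > tol:
                return False
    return True


if __name__ == "__main__":
    for m in (5, 8, 12):
        print(m, len(unit_characters(m)), check_orthogonality(m))
```

A Dirichlet character on all of $\mathbb{Z}$ is then obtained by setting $\chi(n) = 0$ when $(n, m) > 1$ and looking up $n \bmod m$ otherwise.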


§4 Dirichlet L-functions 


Let $\chi$ be a Dirichlet character modulo $m$. We define the Dirichlet L-function
associated to $\chi$ by the formula

$$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}.$$

Since $|\chi(n) n^{-s}| \le n^{-s}$ we see that the terms of $L(s, \chi)$ are dominated in
absolute value by the corresponding terms of $\zeta(s)$. Thus $L(s, \chi)$ converges
and is continuous for $s > 1$. Moreover, since $\chi$ is completely multiplicative we
have a product formula for $L(s, \chi)$ in exactly the same way as for $\zeta(s)$. Namely,

$$L(s, \chi) = \prod_{p} (1 - \chi(p) p^{-s})^{-1}.$$

Since $\chi(p) = 0$ for $p \mid m$ the above product is over positive primes not
dividing $m$. The formula is valid for $s > 1$.

There is a close connection between $L(s, \chi_0)$ and $\zeta(s)$. In fact,

$$L(s, \chi_0) = \prod_{p \nmid m} (1 - p^{-s})^{-1} = \prod_{p \mid m} (1 - p^{-s}) \prod_{p} (1 - p^{-s})^{-1} = \prod_{p \mid m} (1 - p^{-s})\, \zeta(s).$$

From Proposition 16.1.2 we see $\lim_{s \to 1} (s - 1) L(s, \chi_0) = \prod_{p \mid m} (1 - p^{-1}) =
\phi(m)/m$. In particular $L(s, \chi_0) \to \infty$ as $s \to 1$.

To generalize the proof of Proposition 16.2.1 we will need to consider
$\ln L(s, \chi)$. Even if we restrict $s$ to be real, the values of $L(s, \chi)$ are in general
complex so it is necessary to worry about the fact that $\ln z$ is multivalued as a
function of a complex variable $z$. One way around this is to define $\ln L(s, \chi)$ by
an infinite series.

Let $\chi$ be a Dirichlet character and define $G(s, \chi) = \sum_{p} \sum_{k=1}^{\infty} (1/k) \chi(p^k) p^{-ks}$.
Since $|(1/k) \chi(p^k) p^{-ks}| \le p^{-ks}$ and since $\zeta(s)$ converges for $s > 1$ and
converges uniformly for $s \ge 1 + \delta > 1$ we can conclude the same assertions are
true for $G(s, \chi)$. Consequently $G(s, \chi)$ is continuous for $s > 1$. Moreover, for $z$
a complex number with $|z| < 1$ we have $\exp\left(\sum_{k=1}^{\infty} (1/k) z^k\right) = (1 - z)^{-1}$,
where exp denotes the usual exponential function. Substituting $z = \chi(p) p^{-s}$
we find $\exp\left(\sum_{k=1}^{\infty} (1/k) \chi(p^k) p^{-ks}\right) = (1 - \chi(p) p^{-s})^{-1}$ and a simple argument
then shows $\exp G(s, \chi) = L(s, \chi)$ for all $s > 1$. Thus the infinite series $G(s, \chi)$
provides an unambiguous definition for $\ln L(s, \chi)$. To avoid confusion we
work directly with $G(s, \chi)$.

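From the definition and the argument used in the proof of Proposition
16.1.3 we find

$$G(s, \chi) = \sum_{p \nmid m} \chi(p) p^{-s} + R_\chi(s), \qquad \text{(i)}$$

where $R_\chi(s)$ remains bounded as $s \to 1$. Multiply both sides of (i) by $\overline{\chi(a)}$
where $a \in \mathbb{Z}$, $(a, m) = 1$. Then sum over all Dirichlet characters modulo $m$.
The result is

$$\sum_{\chi} \overline{\chi(a)}\, G(s, \chi) = \sum_{p \nmid m} p^{-s} \sum_{\chi} \overline{\chi(a)} \chi(p) + \sum_{\chi} \overline{\chi(a)} R_\chi(s).$$

Using Proposition 16.3.2, part (ii), we see

$$\sum_{\chi} \overline{\chi(a)}\, G(s, \chi) = \phi(m) \sum_{p \equiv a \, (m)} p^{-s} + R_{\chi, a}(s), \qquad \text{(ii)}$$

where $R_{\chi, a}(s)$ remains bounded as $s \to 1$.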

To conclude the proof of Theorem 1 we need the following proposition. 

Proposition 16.4.1. If $\chi_0$ denotes the trivial character modulo $m$, then
$\lim_{s \to 1} G(s, \chi_0) / \ln (s-1)^{-1} = 1$. If $\chi$ is a nontrivial Dirichlet character modulo $m$, then
$G(s, \chi)$ remains bounded as $s \to 1$.

Proof. The first assertion is easy. $L(s, \chi_0)$ is a real valued function of
positive real numbers. We have seen $L(s, \chi_0) = \prod_{p \mid m} (1 - p^{-s})\, \zeta(s)$. It follows
that $G(s, \chi_0) = \sum_{p \mid m} \ln(1 - p^{-s}) + \ln \zeta(s)$. The assertion now follows from
the corollary to Proposition 16.1.2.

The second assertion is quite deep. It is the most difficult part of the proof 
of Dirichlet’s theorem on arithmetic progressions. We postpone the proof to 
the next section. 

Now, assuming the above proposition, the proof of Dirichlet's theorem
follows quickly from Equation (ii). We simply divide all the terms on both
sides by $\ln (s-1)^{-1}$ and take the limit as $s \to 1$. By the above proposition, the
limit on the left-hand side is 1 whereas the limit on the right-hand side is
$\phi(m)\, d(\mathcal{P}(a; m))$. Thus $d(\mathcal{P}(a; m)) = 1/\phi(m)$ and we are done. □


§5 The Key Step 

Up to now all our functions have been defined for $s > 1$. We will show how to
extend the domain of definition to $s > 0$. In particular, if $\chi$ is nontrivial we will
see that $L(1, \chi)$ is a well defined complex number and prove that $L(1, \chi) \ne 0$.
This is the key step. Once we know this it is a relatively simple matter to show
$G(s, \chi)$ remains bounded as $s \to 1$. This was what was left unproved in
Section 4.

In what follows we will consider $s$ as a complex variable. Write $s =
\sigma + it$ where $\sigma$ and $t$ are real. The symbol $\sigma$ will be used throughout to denote
the real part of $s$.

If $a > 0$ is real then $|a^s| = a^\sigma$. From this observation we see that the series
defining $\zeta(s)$ and $L(s, \chi)$ converge and define an analytic function of the
complex variable $s$ in the half plane $\{s \in \mathbb{C} \mid \sigma > 1\}$.

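Lemma 1. Suppose $\{a_n\}$ and $\{b_n\}$ for $n = 1, 2, 3, \ldots$ are sequences of complex
numbers such that $\sum_{n=1}^{\infty} a_n b_n$ converges. Let $A_n = a_1 + a_2 + \cdots + a_n$ and
suppose $A_n b_n \to 0$ as $n \to \infty$. Then

$$\sum_{n=1}^{\infty} a_n b_n = \sum_{n=1}^{\infty} A_n (b_n - b_{n+1}).$$

Proof. Let $S_N = \sum_{n=1}^{N} a_n b_n$. Set $A_0 = 0$. Then

$$S_N = \sum_{n=1}^{N} (A_n - A_{n-1}) b_n = \sum_{n=1}^{N} A_n b_n - \sum_{n=1}^{N} A_{n-1} b_n$$

$$= \sum_{n=1}^{N} A_n b_n - \sum_{n=1}^{N-1} A_n b_{n+1}$$

$$= A_N b_N + \sum_{n=1}^{N-1} A_n (b_n - b_{n+1}).$$

Taking the limit as $N \to \infty$ yields the result. □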

Proposition 16.5.1. $\zeta(s) - (s - 1)^{-1}$ can be continued to an analytic function on
the region $\{s \in \mathbb{C} \mid \sigma > 0\}$.

Proof. Assume $\sigma > 1$. Then, by the lemma

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \sum_{n=1}^{\infty} n \left(n^{-s} - (n + 1)^{-s}\right).$$

For a real number $x$ recall that $[x]$ is the greatest integer less than or equal
to $x$ and $\langle x \rangle = x - [x]$. From the above expression for $\zeta(s)$ we find

$$\zeta(s) = s \sum_{n=1}^{\infty} n \int_{n}^{n+1} x^{-s-1}\, dx$$

$$= s \int_{1}^{\infty} [x]\, x^{-s-1}\, dx$$

$$= \frac{s}{s - 1} - s \int_{1}^{\infty} \langle x \rangle\, x^{-s-1}\, dx.$$

Since $|\langle x \rangle| \le 1$ for all $x$ the last integral converges and defines an analytic
function for $\sigma > 0$. The result follows. □

We will use the same technique to extend $L(s, \chi)$ but first we need another
lemma.


Lemma 2. Let $\chi$ be a nontrivial character modulo $m$. For all $N > 0$ we have
$\left|\sum_{n=0}^{N} \chi(n)\right| \le \phi(m)$.

Proof. Write $N = qm + r$ where $0 \le r < m$. Since $\chi(n + m) = \chi(n)$ for all $n$
we see

$$\sum_{n=0}^{N} \chi(n) = q \left(\sum_{n=0}^{m-1} \chi(n)\right) + \sum_{n=0}^{r} \chi(n).$$

By Proposition 16.3.2, part (i), we have $\sum_{n=0}^{m-1} \chi(n) = 0$. Thus,

$$\left|\sum_{n=0}^{N} \chi(n)\right| = \left|\sum_{n=0}^{r} \chi(n)\right| \le \sum_{n=0}^{m-1} |\chi(n)| = \phi(m). \qquad \Box$$

Proposition 16.5.2. Let $\chi$ be a nontrivial Dirichlet character modulo $m$. Then,
$L(s, \chi)$ can be continued to an analytic function in the region $\{s \in \mathbb{C} \mid \sigma > 0\}$.

Proof. Define $S(x) = \sum_{n \le x} \chi(n)$.

By Lemma 1 we have for $\sigma > 1$,

$$L(s, \chi) = \sum_{n=1}^{\infty} S(n) \left(n^{-s} - (n + 1)^{-s}\right)$$

$$= s \sum_{n=1}^{\infty} S(n) \int_{n}^{n+1} x^{-s-1}\, dx$$

$$= s \int_{1}^{\infty} S(x)\, x^{-s-1}\, dx.$$

By Lemma 2, $|S(x)| \le \phi(m)$ for all $x$. It follows that the above integral
converges and defines an analytic function for all $s$ such that $\sigma > 0$. □

Our goal is to show that for $\chi$ nontrivial $L(1, \chi) \ne 0$. The next proposition
will enable us to give a simple proof in the case where $\chi$ is a complex character,
i.e., a character which takes on nonreal values.

Proposition 16.5.3. Let $F(s) = \prod_{\chi} L(s, \chi)$ where the product is over all
Dirichlet characters modulo $m$. Then, for $s$ real and $s > 1$ we have $F(s) \ge 1$.

Proof. Assume $s$ is real and $s > 1$. Recall that

$$G(s, \chi) = \sum_{p} \sum_{k=1}^{\infty} \frac{1}{k} \chi(p^k) p^{-ks}.$$

Summing over $\chi$ and using Proposition 16.3.2, part (ii), we find

$$\sum_{\chi} G(s, \chi) = \phi(m) \sum_{p^k \equiv 1 \, (m)} \frac{1}{k} p^{-ks},$$

where the sum is over all primes $p$ and integers $k$ such that $p^k \equiv 1 \ (m)$.

The right-hand side of the above equation is nonnegative (in fact, it is
positive). Taking the exponential of both sides shows $\prod_{\chi} L(s, \chi) \ge 1$ as
asserted. □

Proposition 16.5.4. If $\chi$ is a nontrivial complex character modulo $m$, then
$L(1, \chi) \ne 0$.


Proof. From the series defining $L(s, \chi)$ we see that for $s$ real, $s > 1$, $\overline{L(s, \chi)} =
L(s, \bar{\chi})$. Letting $s$ tend towards 1 it follows that $L(1, \chi) = 0$ implies $L(1, \bar{\chi}) = 0$.

Assume $L(1, \chi) = 0$ where $\chi$ is a complex character. The functions $L(s, \chi)$
and $L(s, \bar{\chi})$ are distinct and both have a zero at $s = 1$. In the product $F(s) =
\prod_{\chi} L(s, \chi)$ we know $L(s, \chi_0)$ has a simple pole at $s = 1$ and all the other factors
are analytic about $s = 1$. It follows that $F(1) = 0$. However, Proposition
16.5.3 shows $F(s) \ge 1$ for all real $s > 1$. This is a contradiction. Therefore,
$L(1, \chi) \ne 0$. □

It remains to consider the case where $\chi$ is a nontrivial real character, i.e.,
$\chi(n) = 0$, $1$, or $-1$ for all $n \in \mathbb{Z}$. Dirichlet was able to prove $L(1, \chi) \ne 0$ by
using his class number formula for quadratic number fields (to be more
accurate, for equivalence classes of binary quadratic forms of fixed
discriminant). We will use an elegant proof due to de la Vallée Poussin (1896),
following the exposition of Davenport [119].

Lemma 3. Suppose $f$ is a nonnegative, multiplicative function on $\mathbb{Z}^+$, i.e., for all
$m, n > 0$ with $(m, n) = 1$, $f(mn) = f(m) f(n)$. Assume there is a constant
$c$ such that $f(p^k) < c$ for all prime powers $p^k$. Then $\sum_{n=1}^{\infty} f(n) n^{-s}$ converges
for all real $s > 1$. Moreover

$$\sum_{n=1}^{\infty} f(n) n^{-s} = \prod_{p} \left(1 + \sum_{k=1}^{\infty} f(p^k) p^{-ks}\right).$$

Proof. Fix $s > 1$. Let $a(p) = \sum_{k=1}^{\infty} f(p^k) p^{-ks}$. Then $a(p) < c p^{-s} \sum_{k=0}^{\infty} p^{-ks} =
c p^{-s} (1 - p^{-s})^{-1}$ and so $a(p) < 2c p^{-s}$. For positive $x$ one has $1 + x < \exp x$.
Thus

$$\prod_{p \le N} (1 + a(p)) < \prod_{p \le N} \exp a(p) = \exp \sum_{p \le N} a(p).$$

Now, $\sum_{p \le N} a(p) < 2c \sum_{p} p^{-s} = M$. From the definition of $a(p)$ and the
multiplicativity of $f$ we see $\sum_{n=1}^{N} f(n) n^{-s} < \prod_{p \le N} (1 + a(p))$. It follows that
$\sum_{n=1}^{N} f(n) n^{-s} < \exp M$ for all $N$. Since $f$ is, by assumption, nonnegative we
have $\sum_{n=1}^{\infty} f(n) n^{-s}$ converges.

The last assertion of the lemma follows from the same reasoning used in
the proof of Proposition 16.1.1. □



Theorem 2. Let $\chi$ be a nontrivial Dirichlet character modulo $m$. Then $L(1, \chi) \ne 0$.


Proof. Having already proved that $L(1, \chi) \ne 0$ if $\chi$ is complex we assume $\chi$ is
real.

Assume $L(1, \chi) = 0$ and consider the function

$$\psi(s) = \frac{L(s, \chi)\, L(s, \chi_0)}{L(2s, \chi_0)}.$$

The zero of $L(s, \chi)$ at $s = 1$ cancels the simple pole of $L(s, \chi_0)$ so the
numerator is analytic on $\sigma > 0$. The denominator is nonzero and analytic for
$\sigma > \tfrac{1}{2}$. Thus $\psi(s)$ is analytic on $\sigma > \tfrac{1}{2}$. Moreover, since $L(2s, \chi_0)$ has a pole at
$s = \tfrac{1}{2}$ we have $\psi(s) \to 0$ as $s \to \tfrac{1}{2}$.

We assume temporarily that $s$ is real and $s > 1$. Then $\psi(s)$ has an infinite
product expansion

$$\psi(s) = \prod_{p} (1 - \chi(p) p^{-s})^{-1} (1 - \chi_0(p) p^{-s})^{-1} (1 - \chi_0(p) p^{-2s})$$

$$= \prod_{p \nmid m} \frac{1 - p^{-2s}}{(1 - p^{-s})(1 - \chi(p) p^{-s})}.$$

If $\chi(p) = -1$ the $p$-factor is equal to 1. Thus

$$\psi(s) = \prod_{\chi(p) = 1} \frac{1 + p^{-s}}{1 - p^{-s}},$$

where the product is over all $p$ such that $\chi(p) = 1$. Now,

$$\frac{1 + p^{-s}}{1 - p^{-s}} = (1 + p^{-s}) \left(\sum_{k=0}^{\infty} p^{-ks}\right) = 1 + 2p^{-s} + 2p^{-2s} + \cdots.$$

Applying Lemma 3 we find that $\psi(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ where $a_n \ge 0$ and the
series converges for $s > 1$. Note that $a_1 = 1$. (It is possible, but unnecessary, to
give an explicit formula for $a_n$.)

We once again consider $\psi(s)$ as a function of a complex variable and
expand it in a power series about $s = 2$, $\psi(s) = \sum_{m=0}^{\infty} b_m (s - 2)^m$. Since $\psi(s)$ is
analytic for $\sigma > \tfrac{1}{2}$ the radius of convergence of this power series is at least $\tfrac{3}{2}$.
To compute the $b_m$ we use Taylor's theorem, i.e., $b_m = \psi^{(m)}(2)/m!$ where $\psi^{(m)}(s)$
is the $m$th derivative of $\psi(s)$. Since $\psi(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ we find $\psi^{(m)}(2) =
\sum_{n=1}^{\infty} a_n (-\ln n)^m n^{-2} = (-1)^m c_m$ with $c_m \ge 0$. Thus $\psi(s) = \sum_{m=0}^{\infty} (c_m/m!) (2 - s)^m$
with $c_m$ nonnegative and $c_0 = \psi(2) = \sum_{n=1}^{\infty} a_n n^{-2} \ge a_1 = 1$. It follows that
for real $s$ in the interval $(\tfrac{1}{2}, 2)$ we have $\psi(s) \ge 1$. This contradicts $\psi(s) \to 0$ as
$s \to \tfrac{1}{2}$, and so $L(1, \chi) \ne 0$. □

We are now in a position to prove Proposition 16.4.1. Suppose $\chi$ is a
nontrivial Dirichlet character. We want to show $G(s, \chi)$ remains bounded as
$s \to 1$ through real values $s > 1$.

Since $L(1, \chi) \ne 0$ there is a disc $D$ about $L(1, \chi)$ such that $0 \notin D$. Let $\ln z$ be a
single-valued branch of the logarithm defined on $D$. There is a $\delta > 0$ such that
$L(s, \chi) \in D$ for $s \in (1, 1 + \delta)$. Consider $\ln L(s, \chi)$ and $G(s, \chi)$ for $s$ in this
interval. The exponential of both functions is $L(s, \chi)$. Thus there is an integer
$N$ such that $G(s, \chi) = 2\pi i N + \ln L(s, \chi)$ for $s \in (1, 1 + \delta)$. This implies
$\lim_{s \to 1} G(s, \chi)$ exists and is equal to $2\pi i N + \ln L(1, \chi)$. Since $G(s, \chi)$ has a limit
as $s \to 1$ it clearly remains bounded.


§6 Evaluating $L(s, \chi)$ at Negative Integers

In the last section we showed how to analytically continue $L(s, \chi)$ into the
region $\{s \in \mathbb{C} \mid \sigma > 0\}$. Riemann showed how to analytically continue these
functions to the whole complex plane. As noted earlier this fact has important
consequences for number theory. For example, the values $L(1 - k, \chi)$, where $k$
is a positive integer, are closely related to the Bernoulli numbers. A knowledge
of these numbers has deep connections with the theory of cyclotomic fields.
We will analytically continue $L(s, \chi)$ and evaluate the numbers $L(1 - k, \chi)$
following a method due to D. Goss [141].

Before beginning we need to discuss some properties of the $\Gamma$-function.
This is defined by

$$\Gamma(s) = \int_{0}^{\infty} e^{-t} t^{s-1}\, dt. \qquad \text{(i)}$$

It is not hard to see that the integral converges and defines an analytic
function on the region $\{s \in \mathbb{C} \mid \sigma > 0\}$. For $\sigma > 1$ we integrate by parts and
find

$$\Gamma(s) = -e^{-t} t^{s-1} \Big|_{0}^{\infty} + (s - 1) \int_{0}^{\infty} e^{-t} t^{s-2}\, dt.$$

It follows that $\Gamma(s) = (s - 1)\Gamma(s - 1)$ for $\sigma > 1$. Since $\Gamma(1) = \int_{0}^{\infty} e^{-t}\, dt
= 1$ we see $\Gamma(n + 1) = n!$ for positive integers $n$.

The functional equation $\Gamma(s) = (s - 1)\Gamma(s - 1)$ enables us to analytically
continue $\Gamma(s)$ by a step by step process.

If $\sigma > -1$ we define $\Gamma_1(s)$ by

$$\Gamma_1(s) = \frac{1}{s}\, \Gamma(s + 1). \qquad \text{(ii)}$$

For $\sigma > 0$, $\Gamma_1(s) = \Gamma(s)$. Moreover, $\Gamma_1(s)$ is analytic on $\sigma > -1$ except for
a simple pole at $s = 0$.

Similarly, if $k$ is a positive integer we define

$$\Gamma_k(s) = \frac{\Gamma(s + k)}{s(s + 1) \cdots (s + k - 1)}.$$

$\Gamma_k(s)$ is analytic on $\{s \in \mathbb{C} \mid \sigma > -k\}$ except for simple poles at $s = 0, -1, \ldots,
1 - k$ and $\Gamma_k(s) = \Gamma(s)$ for $\sigma > 0$. These functions fit together to give an
analytic continuation of $\Gamma(s)$ to the whole complex plane with poles at the
nonpositive integers and nowhere else. From now on $\Gamma(s)$ will denote this
extended function. We remark, without proof, that $\Gamma(s)^{-1}$ is entire.

We will now show how to analytically continue $\zeta(s)$ by the same process.
It is necessary to express $\zeta(s)$ as an integral. In Equation (i) substitute $nt$ for $t$.
We find, for $\sigma > 1$

$$n^{-s}\, \Gamma(s) = \int_{0}^{\infty} e^{-nt} t^{s-1}\, dt. \qquad \text{(iii)}$$

Sum both sides of (iii) for $n = 1, 2, 3, \ldots$. It is not hard to justify
interchanging the sum and the integral. The result is

$$\Gamma(s)\zeta(s) = \int_{0}^{\infty} \frac{e^{-t}}{1 - e^{-t}}\, t^{s-1}\, dt. \qquad \text{(iv)}$$

If we tried to integrate by parts at this stage we would be blocked by the
fact that $1 - e^{-t}$ is zero when $t = 0$. To get around this we use a trick. In (iv)
substitute $2t$ for $t$. We find

$$2^{1-s}\, \Gamma(s)\zeta(s) = 2 \int_{0}^{\infty} \frac{e^{-2t}}{1 - e^{-2t}}\, t^{s-1}\, dt. \qquad \text{(v)}$$

Define $\zeta^*(s) = (1 - 2^{1-s})\zeta(s)$ and $R(x) = x/(1 - x) - 2(x^2/(1 - x^2))$.
Subtracting (v) from (iv) yields

$$\Gamma(s)\zeta^*(s) = \int_{0}^{\infty} R(e^{-t})\, t^{s-1}\, dt. \qquad \text{(vi)}$$

What has been gained? A simple algebraic manipulation shows $R(x) =
x/(1 + x)$. Thus $R(e^{-t}) = e^{-t}/(1 + e^{-t})$ has a denominator that does not
vanish at $t = 0$. The integral in Equation (vi) thus converges for $\sigma > 0$ and
this equation provides a continuation for $\zeta(s)$ to the region $\{s \in \mathbb{C} \mid \sigma > 0\}$.
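
For the reader who wants the algebraic manipulation spelled out, one way to carry it out is to put both terms of $R(x)$ over the common denominator $1 - x^2 = (1 - x)(1 + x)$:

```latex
\begin{aligned}
R(x) &= \frac{x}{1 - x} - \frac{2x^{2}}{1 - x^{2}}
      = \frac{x(1 + x) - 2x^{2}}{(1 - x)(1 + x)} \\
     &= \frac{x(1 - x)}{(1 - x)(1 + x)}
      = \frac{x}{1 + x}.
\end{aligned}
```

The factor $1 - x$, which caused the trouble at $t = 0$ (that is, at $x = e^{-t} = 1$), cancels.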

Let $R_0(t) = R(e^{-t})$ and for $m \ge 1$, $R_m(t) = (d^m/dt^m) R(e^{-t})$. It is easy to see
that $R_m(t) = e^{-t} P_m(e^{-t}) (1 + e^{-t})^{-2^m}$ where $P_m$ is a polynomial. It follows
that $R_m(0)$ is finite and $e^{t} R_m(t)$ is bounded as $t \to \infty$. These facts enable
us to repeatedly integrate by parts in Equation (vi).

Take $u = R(e^{-t})$ and $dv = t^{s-1}\, dt$. Then $du = R_1(t)\, dt$ and $v = t^s/s$. Thus

$$\Gamma(s)\zeta^*(s) = \frac{1}{s}\, t^s R_0(t) \Big|_{0}^{\infty} - \frac{1}{s} \int_{0}^{\infty} R_1(t)\, t^s\, dt$$

and so

$$\Gamma(s + 1)\zeta^*(s) = -\int_{0}^{\infty} R_1(t)\, t^s\, dt. \qquad \text{(vii)}$$

The integral in (vii) converges to an analytic function in $\{s \in \mathbb{C} \mid \sigma > -1\}$,
and provides an analytic continuation of $\zeta(s)$ to this region. Continuing this
process we find for $k$ a positive integer

$$\Gamma(s + k)\zeta^*(s) = (-1)^k \int_{0}^{\infty} R_k(t)\, t^{s+k-1}\, dt, \qquad \text{(viii)}$$

where the integral converges to an analytic function of $s$ for $\sigma > -k$. This
procedure provides an analytic continuation of $\zeta(s)$ to the whole complex
plane. We continue to use the notation $\zeta(s)$ for the extended function.

Proposition 16.6.1. Let $k$ be a positive integer. Then, $\zeta(0) = -\tfrac{1}{2}$ and for
$k > 1$, $\zeta(1 - k) = -B_k/k$ where $B_k$ is the $k$th Bernoulli number.

Proof. In Equation (viii) substitute $s = 1 - k$. The result is $\zeta^*(1 - k) =
(-1)^k \int_{0}^{\infty} R_k(t)\, dt$. Since $R_k(t) = (d/dt) R_{k-1}(t)$ we deduce $(1 - 2^k)\zeta(1 - k) =
(-1)^{k-1} R_{k-1}(0)$.

By definition $R_{k-1}(t)$ is the $(k - 1)$st derivative of

$$\frac{e^{-t}}{1 - e^{-t}} - 2\, \frac{e^{-2t}}{1 - e^{-2t}} = \frac{1}{e^{t} - 1} - \frac{2}{e^{2t} - 1}.$$

By Taylor's theorem, $R_{k-1}(0)$ is $(k - 1)!$ times the coefficient of $t^{k-1}$ in the
power series expansion of this function about $t = 0$. Since $t/(e^t - 1) =
\sum_{k=0}^{\infty} (B_k/k!)\, t^k$ we find $\zeta(1 - k) = (-1)^{k-1} B_k/k$. If $k = 1$, then $\zeta(0) = B_1 =
-\tfrac{1}{2}$. If $k > 1$ and odd, then $B_k = 0$. Thus for $k > 1$, $\zeta(1 - k) = -B_k/k$. □
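
The values in Proposition 16.6.1 are easy to tabulate exactly. The sketch below assumes the convention $t/(e^t - 1) = \sum B_k t^k/k!$ used in the proof (so $B_1 = -\tfrac{1}{2}$) and computes the Bernoulli numbers from the standard recurrence $\sum_{j=0}^{n} \binom{n+1}{j} B_j = 0$ for $n \ge 1$; the function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def bernoulli(n_max):
    """[B_0, ..., B_n_max] with t/(e^t - 1) = sum B_k t^k / k!, so B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # sum_{j=0}^{n} C(n+1, j) B_j = 0 determines B_n from the earlier values
        acc = sum(comb(n + 1, j) * B[j] for j in range(n))
        B.append(-acc / (n + 1))
    return B


def zeta_at(k):
    """zeta(1 - k) = (-1)^(k-1) B_k / k for an integer k >= 1."""
    B = bernoulli(k)
    return (-1) ** (k - 1) * B[k] / k


if __name__ == "__main__":
    for k in range(1, 9):
        print(f"zeta({1 - k}) = {zeta_at(k)}")
```

The sign $(-1)^{k-1}$ only matters for $k = 1$, since $B_k = 0$ for odd $k > 1$, which is how the proof passes to the form $-B_k/k$.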

Assume now that $\chi$ is a nontrivial character modulo $m$. To handle $L(s, \chi)$
we proceed in exactly the same way as for $\zeta(s)$. In Equation (iii) multiply both
sides by $\chi(n)$ and sum over $n$. The result is $\Gamma(s) L(s, \chi) = \int_{0}^{\infty} F_\chi(e^{-t})\, t^{s-1}\, dt$,
where

$$F_\chi(e^{-t}) = \sum_{n=1}^{\infty} \chi(n) e^{-nt} = \sum_{a=1}^{m} \chi(a) \sum_{k=0}^{\infty} e^{-(a + km)t} = \sum_{a=1}^{m} \chi(a)\, \frac{e^{-at}}{1 - e^{-mt}}.$$

If we define $L^*(s, \chi) = (1 - 2^{1-s}) L(s, \chi)$, then in the same way as we derived
Equation (vi) we find

$$\Gamma(s) L^*(s, \chi) = \int_{0}^{\infty} R_\chi(e^{-t})\, t^{s-1}\, dt, \qquad \text{(ix)}$$

where

$$R_\chi(x) = F_\chi(x) - 2F_\chi(x^2) = \sum_{a=1}^{m} \chi(a) \left(\frac{x^a}{1 - x^m} - 2\, \frac{x^{2a}}{1 - x^{2m}}\right) = \sum_{a=1}^{m} \chi(a)\, \frac{x^a (1 + x^m - 2x^a)}{(1 - x)(1 + x + \cdots + x^{2m-1})}.$$

For each value of $a$ we see $x = 1$ is a root of $1 + x^m - 2x^a$, and it follows
that $R_\chi(x)$ has the form

$$\frac{x f(x)}{1 + x + \cdots + x^{2m-1}},$$

where $f(x)$ is a polynomial. Let $R_{\chi,0}(t) = R_\chi(e^{-t})$ and $R_{\chi,k}(t) = (d^k/dt^k) R_\chi(e^{-t})$.

By repeated integration by parts we find in the same way that we derived
Equation (viii) that

$$\Gamma(s + k) L^*(s, \chi) = (-1)^k \int_{0}^{\infty} R_{\chi,k}(t)\, t^{s+k-1}\, dt. \qquad \text{(x)}$$

The integral in (x) converges to an analytic function in $\{s \in \mathbb{C} \mid \sigma > -k\}$.
These formulas provide an analytic continuation of $L^*(s, \chi)$ and thus $L(s, \chi)$
to the whole complex plane.

Before attempting to evaluate $L(s, \chi)$ at the negative integers we need a
definition.


Definition. Let $\chi$ be a nontrivial Dirichlet character modulo $m$. The
generalized Bernoulli number $B_{n,\chi}$ is defined by the following formula

$$\sum_{a=1}^{m} \chi(a)\, \frac{t e^{at}}{e^{mt} - 1} = \sum_{n=0}^{\infty} \frac{B_{n,\chi}}{n!}\, t^n. \qquad \text{(xi)}$$

In the literature it is usual to define $B_{n,\chi}$ in this manner only if $\chi$ is a primitive
character modulo $m$. We will discuss this point later.


Lemma 1. $t F_\chi(e^{-t}) = \sum_{n=0}^{\infty} (-1)^n (B_{n,\chi}/n!)\, t^n$.

Proof. Simply substitute $-t$ for $t$ in Equation (xi). □


Proposition 16.6.2. Let $k$ be a positive integer. Then $L(1 - k, \chi) = -B_{k,\chi}/k$.

Proof. In Equation (x) substitute $s = 1 - k$. The result is $(1 - 2^k) L(1 - k, \chi)
= (-1)^k \int_{0}^{\infty} R_{\chi,k}(t)\, dt$. Since $R_{\chi,k}(t) = (d/dt) R_{\chi,k-1}(t)$ it follows that
$(1 - 2^k) L(1 - k, \chi) = (-1)^{k-1} R_{\chi,k-1}(0)$. Since

$$R_{\chi,k-1}(t) = \frac{d^{k-1}}{dt^{k-1}}\, R_\chi(e^{-t})$$

and $R_\chi(e^{-t}) = F_\chi(e^{-t}) - 2F_\chi(e^{-2t}) = (1/t) \sum_{k=1}^{\infty} (-1)^k (1 - 2^k) (B_{k,\chi}/k!)\, t^k$
(by Lemma 1) we see that $(-1)^{k-1} R_{\chi,k-1}(0) = -(1 - 2^k)(B_{k,\chi}/k)$. Thus,
$L(1 - k, \chi) = -B_{k,\chi}/k$ as asserted. □
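
For real characters the generalized Bernoulli numbers are rational and can be computed exactly. The sketch below does not expand (xi) directly; it uses the equivalent expression $B_{n,\chi} = m^{n-1} \sum_{a=1}^{m} \chi(a) B_n(a/m)$, where $B_n(x) = \sum_j \binom{n}{j} B_j x^{n-j}$ is the $n$th Bernoulli polynomial, which follows from (xi) by writing $t e^{at}/(e^{mt} - 1) = \frac{1}{m} \cdot (mt) e^{(a/m)(mt)}/(e^{mt} - 1)$. The function names and the two sample characters are assumptions of the example.

```python
from fractions import Fraction
from math import comb


def bernoulli(n_max):
    """[B_0, ..., B_n_max] with t/(e^t - 1) = sum B_k t^k / k!, so B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        acc = sum(comb(n + 1, j) * B[j] for j in range(n))
        B.append(-acc / (n + 1))
    return B


def bernoulli_poly(n, x, B):
    """Bernoulli polynomial B_n(x) = sum_j C(n, j) B_j x^(n - j)."""
    return sum(comb(n, j) * B[j] * x ** (n - j) for j in range(n + 1))


def generalized_bernoulli(n, chi, m):
    """B_{n,chi} = m^(n-1) * sum_{a=1}^{m} chi(a) B_n(a/m) for a rational-valued chi mod m."""
    B = bernoulli(n)
    total = sum(chi(a) * bernoulli_poly(n, Fraction(a, m), B) for a in range(1, m + 1))
    return Fraction(m) ** (n - 1) * total


def L_at(k, chi, m):
    """L(1 - k, chi) = -B_{k,chi} / k (Proposition 16.6.2)."""
    return -generalized_bernoulli(k, chi, m) / k


def chi4(n):
    """Nontrivial character modulo 4."""
    return 0 if n % 2 == 0 else (1 if n % 4 == 1 else -1)


def chi3(n):
    """Nontrivial character modulo 3."""
    return (0, 1, -1)[n % 3]


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, L_at(k, chi4, 4), L_at(k, chi3, 3))
```

With the trivial character taken modulo 1 the same function returns $\tfrac{1}{2} = -B_1$ for $n = 1$ and the ordinary $B_n$ for $n \ge 2$.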

It follows from Equation (xi) that the numbers $B_{k,\chi}$ are in the field generated
over $\mathbb{Q}$ by the values of $\chi$. Thus, in particular, they are algebraic numbers.

As mentioned earlier it is usual to define $B_{n,\chi}$ by Equation (xi) only when $\chi$
is a primitive character modulo $m$. This means that $\chi$ when restricted to
$\{n \in \mathbb{Z} \mid (n, m) = 1\}$ does not have a smaller period than $m$. The trivial
character is primitive only for the modulus 1. From Equation (xi) we then have

$$\sum_{n=0}^{\infty} \frac{B_{n,\chi_0}}{n!}\, t^n = \frac{t e^{t}}{e^{t} - 1} = \frac{t}{e^{t} - 1} + t.$$

Thus $-B_{1,\chi_0} = B_1$ and $B_{n,\chi_0} = B_n$ for $n \ge 2$. It is in this sense that the $B_{n,\chi}$
are "generalized Bernoulli numbers."

The $B_{n,\chi}$ have many interesting arithmetic properties. The interested
reader should consult Chapter 2 of Iwasawa's monograph [155]. This
monograph is devoted to showing how the equation $L(1 - k, \chi) = -B_{k,\chi}/k$ leads
to p-adic L-functions and to the remarkable connection between these
functions and the theory of cyclotomic fields. Other approaches to these
topics are given in the books of S. Lang [167] and [171]. More accessible to the novice
than these works is the book of N. Koblitz [162].


Notes 

Legendre attempted, without success, to prove the existence of infinitely 
many primes in an arithmetic progression a -h bn, (a, b) = 1. Dirichlet 
states that, unable to overcome the difficulties in completing Legendre’s 
argument, he was subsequently led to study a class of infinite series and 
products analogous to those considered by Euler (see [124]). The results of 
Dirichlet’s investigation are far reaching for the development of algebraic 
and analytic number theory. In addition to proving the existence of primes in 
an arithmetic progression Dirichlet was able, using the analytic techniques he 
introduced, to derive explicit formulas, conjectured in part by Jacobi (see the
Notes to Chapter 14), for the class numbers of quadratic number fields. For
example, if $p$ is prime, $p \equiv 3 \ (4)$, $p > 3$, then the class number of $\mathbb{Q}(\sqrt{-p})$ is
$(\sqrt{p}/\pi) L(1, \chi)$ where $\chi$ is the Dirichlet character associated to the Legendre
symbol. The well-known expression $\left(-\sum_{x=1}^{p-1} x \chi(x)\right)/p$ for the class number is
then obtained by deriving a closed form for $L(1, \chi)$ (see [9], p. 343). This in
turn is obtained using the value of the classical Gauss sum. Since class
numbers are positive we see that this approach shows $L(1, \chi) \ne 0$.

If F is a Galois extension of Q of degree n then one may show by an 
extension of the methods of this chapter, that the set of prime numbers p that 
split completely in F, i.e., that are the product of n distinct prime ideals in F, 
has Dirichlet density $1/n$. As a corollary it can be shown that if $f(x)$ is an
irreducible polynomial with integer coefficients then the set of primes $p$ for
which $f(x)$ is the product of linear factors modulo $p$ has density $1/n$ where $n$ is
the degree of the splitting field of $f(x)$.

The generalized Bernoulli numbers for quadratic characters appear in
A. Hurwitz [153]. In this paper Hurwitz derives the functional equation for
$L(s, \chi)$, $\chi$ quadratic, through consideration of the partial zeta functions
$\sum_{r=0}^{\infty} 1/(mr + a)^s$. The values at negative integers of these latter functions
may be found by either the classical method or that of Goss, as done in this
chapter. A suitable linear combination of these values then yields the
expression for $L(1 - k, \chi)$ (Proposition 16.6.2). N. C. Ankeny, E. Artin, and
S. Chowla also introduced generalized Bernoulli numbers for quadratic
characters in connection with certain remarkable congruences relating the
class number of a real quadratic field and the components of the fundamental
unit [86]. The definition and basic properties of generalized Bernoulli
numbers are given in H. Leopoldt [178] who employs them elsewhere to
obtain a generalization, to arbitrary abelian extensions of $\mathbb{Q}$, of Kummer's
criterion for the divisibility of the class number of $\mathbb{Q}(\zeta_p)$ (see the comment
following Theorem 4, Chapter 15). Leopoldt proves in this paper a theorem of
the von Staudt-Clausen type for $B_{n,\chi}$. See also Carlitz [104] and the
monograph on p-adic L-functions by K. Iwasawa [155].


Exercises 

1. Using the method of Section 2 compute the density of the set of primes congruent to 
1 modulo 3. 

2. Let $p_1, \ldots, p_n$ be primes congruent to 1 modulo 4. If $p$ is a prime dividing
$\left(2 \prod_{i=1}^{n} p_i\right)^2 + 1$ show that $p \equiv 1 \ (4)$ and $p \ne p_i$, $i = 1, \ldots, n$.

3. Compute the set of Dirichlet characters modulo 8 and modulo 12. 

4. Let $\chi$ be the nontrivial Dirichlet character modulo 3. Show that

$$L(1, \chi) = \sum_{n=0}^{\infty} \frac{1}{(3n + 1)(3n + 2)}.$$

Can you find the exact value of $L(1, \chi)$? (See Exercise 8.)

5. Use Theorem 2, Chapter 13 to determine the Dirichlet density of the set of primes $p$
which factor into the product of 4 distinct prime ideals in the ring of integers in $\mathbb{Q}(\zeta)$,
$\zeta = e^{2\pi i/5}$.

6. Generalize Exercise 5 to $\mathbb{Q}(\zeta_m)$ for general $m$.

7. By considering $\Phi_m(x)$ modulo $p$ give an algebraic proof that there are an infinite
number of primes in the progression $mk + 1$, $k = 1, 2, 3, \ldots$.

8. Let $g(\chi)$ be the classical Gauss sum $\sum_{t} \chi(t) \zeta^{t}$, $\chi$ the Legendre symbol, $\zeta =
e^{2\pi i/p}$, $p$ prime. Define $P = \prod (1 - \zeta^{n}) \prod (1 - \zeta^{r})^{-1}$ where $n$, $r$ run over respectively
the nonsquares and squares modulo $p$. Show that

$$P = \exp(g(\chi) L(1, \chi)).$$

9. Using Exercise 8 compute $L(1, \chi)$ where $\chi$ is the nontrivial quadratic character
modulo 5.


10. (Chowla) The notation being as in Exercise 8 show that $P \ne 1$ (and thus $L(1, \chi) \ne
0$!!) as follows. Choose $c$ a nonsquare modulo $p$. Prove that $P = 1$ implies

/I — x Cr 

Obtain a contradiction by specializing x! 

11. Use Dirichlet’s theorem to show that Galois extensions of Q exist with any pre¬ 
scribed finite cyclic group as group of automorphisms. 

12. Derive the irreducibility over Q of the cyclotomic polynomial ❿ "(x) from Dirichlet’s 
theorem (Landau [166], Vol. 2). 



13. Let $\chi$ be a Dirichlet character modulo m, $\chi(2) \neq 0$. Show

$$L(s, \chi) = (1 - 2^{-s}\chi(2))^{-1} \sum_{n=0}^{\infty} \frac{\chi(2n + 1)}{(2n + 1)^s}.$$


The following exercises adapted from Moser [193] give a short proof that there are more
squares than nonsquares on the interval $[1, (p - 1)/2]$ for $p \equiv 3 \ (4)$, p prime. In Exercises
14, 15, 16, 17, $p \equiv 3 \ (4)$.

14. Let p = 3 (4). Show that 


p-1 

z 

x= l 



15. Show that, using Exercise 14, 




sin(2ntm/p) 

m 


[Hint: replace x by nt and sum.]


16. Using the elementary fact from Fourier series 


00 


I 


sin(2n — l)x if 0 < x < 7r, 


2n 


— 7t/4, if 7r < x < 2 丌 , 


show that 


fiodd \pj ^ ^\fp 


'(P-D/2 / t \ 

?, Q 


p 一 i 

I 

(p + l )/2 \P/ 


n 


( P - 1)/2 


2^/p \P/ 

17. Since $\sum_{t=1}^{(p-1)/2} (t/p) \neq 0$ (why?) conclude that $\sum_{n\ \mathrm{odd}} (n/p)(1/n) > 0$ and thus
$\sum_{t=1}^{(p-1)/2} (t/p) > 0$. Recall $p \equiv 3 \ (4)$.

18. Let m > 2, (a, m) = 1. If a has order f in the group of units modulo m show that there
are infinitely many primes p such that $(p) = P_1 \cdots P_g$, $g = \phi(m)/f$, the $P_i$ distinct prime
ideals in $\mathbb{Q}(\zeta_m)$. What is the density of this set of primes?



Chapter 17 

Diophantine Equations 


In Chapter 10 we discussed Diophantine equations over
finite fields. In this chapter we consider special Diophantine
equations with integral coefficients and seek integral or
rational solutions. The techniques used vary from elementary
congruence considerations to the use of more sophisticated
results in algebraic number theory. In addition to
establishing the existence or nonexistence of solutions we
also obtain results of a quantitative nature, as in the
determination of the number of representations of an
integer as the sum of four squares. All of the equations
considered in this chapter are classical, each playing an
important role in the historical development of the subject.


§1 Generalities and First Examples 

By a Diophantine equation will be understood a polynomial equation

$$f(x_1, x_2, x_3, \ldots, x_n) = 0, \tag{1}$$

whose coefficients are rational integers. If this equation has a solution in
integers $x_1, \ldots, x_n$ then we shall say that $(x_1, \ldots, x_n)$ is an integral solution.
If (1) is homogeneous then a solution distinct from $(0, \ldots, 0)$ is called nontrivial.
A solution to (1) with rational $x_1, \ldots, x_n$ is called a rational solution.
Clearly, in the homogeneous case the problem of finding a rational solution is
equivalent to that of finding an integral solution.

While the degree of $f(x_1, \ldots, x_n)$ controls to some extent the difficulty of
the problem, the existence or nonexistence of a solution is often related to
subtle invariants and even perhaps the complex differential geometry of (1)
over the complex numbers.

We begin by considering the linear Diophantine equation

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = m. \tag{2}$$

Here $a_1, \ldots, a_n, m$ are rational integers. Then by Chapter 1 (see Exercises 6,
13, 14 of that chapter) it follows that a solution in integers exists iff the greatest
common divisor of $a_1, \ldots, a_n$ divides m.

If n = 2 and $d = (a_1, a_2)$ the Euclidean algorithm gives an explicit procedure
for constructing a solution to $a_1 x_1 + a_2 x_2 = d$ (Exercises 2 and 4,
Chapter 1). Multiplying the solution by m/d gives a solution to (2). For
n > 2 one may proceed by induction using the simple observation that

$$((a_1, \ldots, a_{n-1}), a_n) = (a_1, \ldots, a_n).$$
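
The inductive procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `extended_gcd` and `solve_linear` are ours, not the book's, and the routine returns one particular solution rather than the general one.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:  # normalise the gcd to be nonnegative
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def solve_linear(coeffs, m):
    """Return one integer solution xs of sum(a_i * x_i) == m, or None.

    Uses ((a_1, ..., a_{k-1}), a_k) = (a_1, ..., a_k): the multipliers found
    so far are rescaled each time a new coefficient is absorbed.
    """
    g, xs = 0, []
    for a in coeffs:
        g, s, t = extended_gcd(g, a)     # new gcd = s*(old gcd) + t*a
        xs = [s * x for x in xs] + [t]   # keeps sum(a_i * x_i) == g
    if g == 0:
        return [0] * len(coeffs) if m == 0 else None
    if m % g != 0:
        return None                      # gcd does not divide m
    return [x * (m // g) for x in xs]
```

For instance, `solve_linear([6, 10, 15], 7)` returns a triple of integers satisfying 6x + 10y + 15z = 7, while `solve_linear([4, 6], 7)` returns `None` because (4, 6) = 2 does not divide 7.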

If (1) has an integral solution then for each prime p the congruence

$$f(x) \equiv 0 \ (p) \tag{3}$$

has a solution. If therefore one can find a prime p for which (3) has no solution
then (1) also has no solution. This method can be applied in many special
cases to obtain nonexistence theorems. We will consider several examples of
this technique.

For example, consider the equation

$$y^2 = x^3 + 7. \tag{4}$$

If (4) has a solution then x is odd. For otherwise reduction modulo 4 would
imply that 3 is a square modulo 4 which is not the case. Write (4) as

$$y^2 + 1 = (x + 2)(x^2 - 2x + 4) = (x + 2)((x - 1)^2 + 3). \tag{5}$$

Now since $(x - 1)^2 + 3$ is of the form $4n + 3$ there is a prime p of the form
$4n + 3$ dividing it and reduction of (5) modulo p implies that $-1$ is a square
modulo p. But this contradicts Proposition 5.1.2, Corollary 3. Of course this
ingenious argument works only because one chose $x^3 + 7$. There are many
results concerning the rational and integral solutions of the equation

$$y^2 = x^3 + k \tag{6}$$

for special values of k (see Section 10). The interested reader should consult
Mordell [189] for an indication of the vast array of techniques used to discuss
(6). We mention in passing that it follows from deep theorems of Mordell and
Siegel that (6) has only a finite number of integral solutions. The question of
rational solutions leads to the famous conjectures of Birch and Swinnerton-Dyer.
A statement of these conjectures will be given in the next chapter.

Consider next the equation

$$y^3 - px = 2. \tag{7}$$

Here p is a prime, $p \equiv 1 \ (3)$. We note that this Diophantine equation is
equivalent to the congruence

$$y^3 \equiv 2 \ (p). \tag{8}$$

By Proposition 9.6.2, Equation (7) has a solution iff $p = C^2 + 27D^2$ for
suitable integers C and D. Thus the Diophantine problem (7) is related to the
question of the representability of p by the quadratic form $x^2 + 27y^2$.

In a similar manner quadratic reciprocity can be used to show that

$$y^2 = 41x + 3 \tag{9}$$

has no solution. For reduction modulo 41 shows that 3 is a square modulo 41.
But since $41 \equiv 1 \ (4)$ quadratic reciprocity implies 41 is a square modulo 3
which is not the case.
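
The congruence obstructions used for (4) and (9) are easy to confirm numerically. The following is a small sketch, not part of the original argument; the helper names `squares_mod`, `legendre` and `small_solutions_y2_eq_x3_plus_7` are our own, and the Legendre symbol is computed with Euler's criterion.

```python
from math import isqrt


def squares_mod(n):
    """The set of squares modulo n."""
    return {(y * y) % n for y in range(n)}


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def small_solutions_y2_eq_x3_plus_7(bound):
    """All (x, y) with |x| <= bound, y >= 0 and y^2 = x^3 + 7 (expected: none)."""
    found = []
    for x in range(-bound, bound + 1):
        v = x ** 3 + 7
        if v < 0:
            continue
        y = isqrt(v)
        if y * y == v:
            found.append((x, y))
    return found
```

Here `3 in squares_mod(41)` is false and `legendre(3, 41)` equals `legendre(41, 3)`, as reciprocity predicts for $41 \equiv 1 \ (4)$; both are $-1$. The finite search for (4) is of course no substitute for the proof, it merely finds nothing in the range tried.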





A well-known Diophantine equation is given by

$$x^2 + y^2 = z^2. \tag{10}$$

The solutions in integers are known as Pythagorean triples. We solve this
problem using Proposition 1.4.1 which states that $\mathbb{Z}[i]$ is a unique factorization
domain. A proof that does not use complex numbers can be found, for
example, in Hardy and Wright [40], p. 190. Assume that (10) has a solution
and that (x, y) = 1. Thus x and y are not both even and reduction of (10)
modulo 4 shows that z is odd. Factor (10) in $\mathbb{Z}[i]$ to obtain

$$(x + iy)(x - iy) = z^2. \tag{11}$$

If $\pi$ is an irreducible in $\mathbb{Z}[i]$ that divides $x + iy$ and $x - iy$ then $\pi$ divides 2x
and 2y. Since z is odd $(\pi) \neq (1 + i)$ for otherwise $\pi\bar{\pi} = 2 \mid z^2$. Thus $\pi \mid x$ and
$\pi \mid y$. Taking norms shows that the rational prime p dividing $N(\pi)$ divides both
x and y, which contradicts the fact that (x, y) = 1. Thus $x + iy$ and $x - iy$ are
relatively prime. If $z = u\pi_1^{a_1} \cdots \pi_s^{a_s}$, u a unit, is a factorization of z in $\mathbb{Z}[i]$
then, by unique factorization,

$$x + iy = u\beta^2. \tag{12}$$

Writing $\beta = a + bi$ and taking u = 1 gives the solutions

$$x = a^2 - b^2, \qquad y = 2ab, \qquad z = a^2 + b^2.$$

The other choices of the unit give essentially (i.e., up to sign) the same solution.
The identity $(a^2 - b^2)^2 + (2ab)^2 = (a^2 + b^2)^2$ shows that (10) has infinitely
many solutions. The above argument shows that there are no others.
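
As a sanity check on the parametrization, one can compare it with a direct search. The sketch below is illustrative only and makes some assumptions of its own: the function names are ours, triples are normalised so that y is even, and we use the standard fact that (x, y) = 1 corresponds to a > b > 0 coprime of opposite parity.

```python
from math import gcd, isqrt


def primitive_triples_param(limit):
    """Primitive triples (x, y, z), y even, z <= limit, from x = a^2 - b^2, y = 2ab, z = a^2 + b^2."""
    found = set()
    a = 2
    while a * a + 1 <= limit:
        for b in range(1, a):
            z = a * a + b * b
            if z > limit:
                break
            # coprime and of opposite parity is what gcd(x, y) = 1 amounts to
            if gcd(a, b) == 1 and (a - b) % 2 == 1:
                found.add((a * a - b * b, 2 * a * b, z))
        a += 1
    return found


def primitive_triples_brute(limit):
    """The same set, found by direct search over even y and z <= limit."""
    found = set()
    for z in range(1, limit + 1):
        for y in range(2, z, 2):
            x2 = z * z - y * y
            x = isqrt(x2)
            if x > 0 and x * x == x2 and gcd(x, y) == 1:
                found.add((x, y, z))
    return found
```

Both functions return the same set for any bound tried, for example `primitive_triples_param(100) == primitive_triples_brute(100)`, and the set contains the familiar (3, 4, 5).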

We conclude this section by giving a simple example of a homogeneous
cubic equation with no nontrivial solution. For any prime p consider

$$x^3 + py^3 + p^2 z^3 = 0. \tag{13}$$

Assume that (13) has an integer solution (x, y, z), x, y, z not all divisible by p.
Then $p \mid x^3$ so $p \mid x$. Putting $x = px'$ and cancelling shows that $p \mid y^3$ so that
$p \mid y$. Substituting $y = py'$ and cancelling shows that $p \mid z^3$ or $p \mid z$ which is a
contradiction. This elegant example is due to Euler (see Hurwitz [154],
p. 455).

§2 The Method of Descent 


This method, first enunciated by P. Fermat, may be used to handle several
important Diophantine equations. The technique is best illustrated by
examples. Consider therefore the Diophantine equation

$$x^4 + y^4 = z^2. \tag{14}$$

We show that (14) has no integral solution with $xyz \neq 0$, z > 0. Assuming
that (14) has such an integral solution we construct another solution with
smaller positive z. This is clearly impossible as it leads to an infinite sequence
of decreasing positive integers. The details are as follows.

We may assume that (x, y, z) = 1, z > 0. Next x and y cannot both be odd
since otherwise reduction modulo 4 would give $z^2 \equiv 2 \ (4)$ which is impossible.
Let then x be odd, y even so that z is odd. Write $y^4 = (z - x^2)(z + x^2)$ and
observe that, since any prime p dividing the two factors on the right must also
divide 2z and $2x^2$, one must have $(z - x^2, z + x^2) = 2$. But the product of the
two factors is a fourth power. The possibilities are therefore

$$z - x^2 = 2a^4, \qquad z + x^2 = 8b^4, \qquad a > 0,\ a \text{ odd},\ (a, b) = 1, \tag{15}$$

or

$$z - x^2 = 8b^4, \qquad z + x^2 = 2a^4, \qquad a > 0,\ a \text{ odd},\ (a, b) = 1. \tag{16}$$

The first case implies $x^2 = -a^4 + 4b^4$ which is impossible since otherwise
$1 \equiv -1 \ (4)$. Thus (16) holds and $z = a^4 + 4b^4$. Note that 0 < a < z. Also
eliminating z in (16) shows that $4b^4 = (a^2 - x)(a^2 + x)$. Since (a, b) = 1 it
follows that (a, x) = 1 and arguing as earlier one sees that $(a^2 - x, a^2 + x) = 2$.
Writing $a^2 - x = 2c^4$ and $a^2 + x = 2d^4$ one obtains

$$a^2 = c^4 + d^4.$$

Thus we have found a solution to (14) with smaller positive value for z and
the proof is complete. □

In particular $x^4 + y^4 = z^4$ has no solution, $xyz \neq 0$. This is a special case
of Fermat’s Last Theorem.


§3 Legendre’s Theorem 

In this section we consider the Diophantine equation

$$ax^2 + by^2 + cz^2 = 0, \tag{17}$$

where a, b, c are square free, pairwise relatively prime integers. We would like
to have necessary and sufficient conditions in order that (17) have a nontrivial
integral solution. In order that a solution exist it is of course necessary to
assume that a, b and c are neither all positive nor all negative.





If m and n are nonzero integers let m R n denote the fact that m is a square
modulo n. In other words there is an integer x with $x^2 \equiv m \ (n)$. Legendre
discovered the following beautiful theorem.

Proposition 17.3.1. Let a, b, c be nonzero integers, square free, pairwise
relatively prime and not all positive nor all negative. Then (17) has a nontrivial
integral solution iff the following conditions are satisfied

(i) $-ab$ R c.

(ii) $-ac$ R b.

(iii) $-bc$ R a.
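
The three conditions are finite checks, so the criterion is easy to try out. The sketch below is only an illustration: the names `is_square_mod`, `legendre_solvable` and `brute_force_solution` are ours, the hypotheses on a, b, c (square free, pairwise coprime, mixed signs) are assumed rather than verified, and the brute-force search is there purely for comparison on small cases.

```python
def is_square_mod(m, n):
    """m R n: is there an integer x with x^2 = m modulo |n|?"""
    n = abs(n)
    return any((x * x - m) % n == 0 for x in range(n))


def legendre_solvable(a, b, c):
    """Legendre's criterion for a*x^2 + b*y^2 + c*z^2 = 0 (hypotheses assumed)."""
    return (is_square_mod(-a * b, c)
            and is_square_mod(-a * c, b)
            and is_square_mod(-b * c, a))


def brute_force_solution(a, b, c, bound):
    """Search 0 <= x, y, z <= bound for a nontrivial solution; None if none is found."""
    for x in range(bound + 1):
        for y in range(bound + 1):
            for z in range(bound + 1):
                if (x, y, z) != (0, 0, 0) and a * x * x + b * y * y + c * z * z == 0:
                    return (x, y, z)
    return None
```

For example `legendre_solvable(2, 3, -5)` is true, matching the solution (1, 1, 1), while `legendre_solvable(1, 1, -3)` is false because $-1$ is not a square modulo 3.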

It is convenient to prove this result in the following equivalent form. 

Proposition 17.3.2. Let a and b be positive square free integers. Then

$$ax^2 + by^2 = z^2 \tag{18}$$

has a nontrivial solution iff the following three conditions are satisfied

(i) a R b.

(ii) b R a.

(iii) $-(ab/d^2)$ R d, where d = (a, b).

In order to see that Proposition 17.3.2 implies Proposition 17.3.1 consider
$ax^2 + by^2 + cz^2 = 0$ as in Proposition 17.3.1 and assume that a and b are
positive while c is negative. Then $-acx^2 - bcy^2 - z^2 = 0$ is easily seen to
satisfy the conditions of Proposition 17.3.2. If (x, y, z) is a solution then since c
is square free $c \mid z$. Putting $z = cz'$ and cancelling we arrive at a solution to (17).
That Proposition 17.3.1 implies Proposition 17.3.2 is left as an exercise.

We now proceed to the proof of Proposition 17.3.2. If a = 1 the proposition
is obvious. Furthermore we may assume $a \geq b$. For if b > a just interchange
x and y. If a = b then by (iii) $-1$ is a square modulo b. By Exercise 25 at the
end of this chapter one can find integers r and s such that $b = r^2 + s^2$. A
solution is then given by x = r, y = s, $z = r^2 + s^2$.

With these preliminaries we proceed to construct a new form $Ax^2 + by^2 = z^2$
satisfying the same hypotheses as (18), 0 < A < a, and such that if it has
a nontrivial solution then so does (18). After a finite number of steps,
interchanging A and b in case A is less than b, we arrive at one of the cases A = 1
or A = b, each of which has been settled. Now for the details.

By (ii) there exist T and c such that

$$c^2 - b = aT = aAm^2; \qquad A, m \in \mathbb{Z} \tag{19}$$

where A is square-free, and $|c| \leq a/2$. First of all we show that 0 < A < a.
This follows from (19) since first of all one has $0 \leq c^2 = aAm^2 + b < a(Am^2 + 1)$.
Thus $A \geq 0$. But since b is square-free A > 0 by (19). Furthermore
by (19) $aAm^2 < c^2 \leq a^2/4$ so that $A \leq Am^2 < a/4 < a$.

Next we verify that A R b. Put $b = b_1 d$, $a = a_1 d$ with $(a_1, b_1) = 1$ and
note that $(a_1, d) = (b_1, d) = 1$ since a and b are square-free. Then (19)
becomes

$$c^2 - b_1 d = a_1 d A m^2 \tag{20}$$

and since d is square-free $d \mid c$. Put $c = c_1 d$ and cancel to obtain

$$d c_1^2 - b_1 = a_1 A m^2. \tag{21}$$

Thus $A a_1 m^2 \equiv -b_1 \ (d)$ or $A a_1^2 m^2 \equiv -a_1 b_1 \ (d)$. But (m, d) = 1 since by (21)
a common factor would divide $b_1$ and d and thus b would not be square-free.
Using (iii) and the fact that m is a unit modulo d we conclude that A R d.
Furthermore $c^2 \equiv aAm^2 \ (b_1)$. Since a R b one has a R $b_1$. Also $(a, b_1) = 1$
since a common divisor would divide d and $b_1$ contradicting the fact that
$b = b_1 d$ is square-free. Similarly $(m, b_1) = 1$ which shows that A R $b_1$. By
Exercise 26, A R $d b_1$ or A R b.

Next write $A = rA_1$, $b = rb_2$, $(A_1, b_2) = 1$. We must verify that $-A_1 b_2$ R r.
From (19) we conclude that

$$c^2 - r b_2 = a r A_1 m^2. \tag{22}$$

But r is square-free so $r \mid c$. If $c = rc_1$ then

$$a A_1 m^2 \equiv -b_2 \ (r).$$

Since a R b we have a R r. Finally writing

$$-a A_1 b_2 m^2 \equiv b_2^2 \ (r)$$

and observing that (a, r) = (m, r) = 1 we conclude $-A_1 b_2$ R r.

Assume now that $AX^2 + bY^2 = Z^2$ has a nontrivial solution. Then

$$AX^2 = Z^2 - bY^2. \tag{23}$$

Multiplying (23) by (19) one has

$$a(AXm)^2 = (Z^2 - bY^2)(c^2 - b) = (Zc + bY)^2 - b(cY + Z)^2.$$

(Note the use of the multiplicativity of the norm map on $\mathbb{Q}(\sqrt{b})$!). Thus (18)
has a solution with

$$x = AXm, \qquad y = cY + Z, \qquad z = Zc + bY.$$

This completes the proof since $X \neq 0$; and $m \neq 0$ as follows from the fact that
b is square-free. □

An important corollary of Proposition 17.3.1 is a special case of the so-called
“Hasse Principle.” This principle states roughly that local solvability
implies global solvability. Here local solvability means that the equation
under consideration has a nontrivial solution modulo $p^m$ for all primes p and
all positive integers m, as well as a real solution, while global solvability refers
to a solution in integers. For quadratic forms this principle is true but it fails
for equations of higher degree. For example, the equation $x^4 - 17y^4 = 2z^4$
has a nontrivial solution modulo $p^m$ for all p and m, and a real solution, but it
has no nontrivial solution in integers [205].

Corollary. Let a, b, c be square-free, pairwise relatively prime integers
not all of the same sign. If for each prime power $p^m$ the congruence

$$ax^2 + by^2 + cz^2 \equiv 0 \ (p^m)$$

has a solution in integers (x, y, z) not all divisible by p then $ax^2 + by^2 + cz^2 = 0$
has a nontrivial integral solution.

Proof. Let m = 2 and suppose $p \mid a$. Then if (x, y, z) is a solution as in the
corollary we show that $p \nmid yz$. For if $p \mid y$, say, then $p \mid cz^2$ which implies, since
(a, c) = 1, that $p \mid z$. Thus $p^2 \mid ax^2$ and since $p \nmid x$ we obtain the contradiction
$p^2 \mid a$. Similarly $p \nmid z$. Thus $by^2 + cz^2 \equiv 0 \ (p)$ and division (mod p) shows that
$-bc$ R p. This being the case for every $p \mid a$ it follows that $-bc$ R a (Exercise
26). Similarly $-ab$ R c and $-ac$ R b and the corollary now follows by
Proposition 17.3.1. □


§4 Sophie Germain’s Theorem 

In Chapter 14 we proved that if Fermat’s equation for an odd prime p

$$x^p + y^p + z^p = 0 \tag{24}$$

had a solution with $p \nmid xyz$ then a very strong congruence held, namely

$$2^{p-1} \equiv 1 \ (p^2).$$

In 1823 Sophie Germain proved the following remarkable result by completely
elementary considerations.

Proposition 17.4.1. If p is an odd prime such that 2p + 1 = q is also prime then
(24) has no integral solution with $p \nmid xyz$.

Proof. Assume on the contrary that such a solution exists and suppose that
(x, y, z) = 1. Write

$$-x^p = (y + z)(z^{p-1} - z^{p-2}y + \cdots + y^{p-1}). \tag{25}$$

The two factors on the right are relatively prime. For clearly $p \nmid y + z$ and if
$r \neq p$ is a prime dividing both factors then since $y \equiv -z \ (r)$ one has

$$0 \equiv z^{p-1} - z^{p-2}y + \cdots + y^{p-1} \equiv p y^{p-1} \ (r),$$

which implies that $r \mid y$. This in turn implies that $r \mid z$ (by (24)) contradicting the
assumption that (x, y, z) = 1. By unique factorization in $\mathbb{Z}$ we conclude that

$$y + z = A^p \tag{26}$$

$$z^{p-1} - z^{p-2}y + \cdots + y^{p-1} = T^p \tag{27}$$

for suitable integers A and T. Similarly

$$x + y = B^p \tag{28}$$

$$x + z = C^p. \tag{29}$$

Since p = (q - 1)/2 reducing (24) modulo q gives

$$x^{(q-1)/2} + y^{(q-1)/2} + z^{(q-1)/2} \equiv 0 \ (q).$$

If $q \nmid xyz$ then each of the terms on the left-hand side is $\pm 1$ modulo q. This is
impossible since q > 5. Thus, by symmetry, we may assume that $q \mid x$. From
(26), (28) and (29) we conclude that

$$B^p + C^p - A^p = 2x$$

so that

$$B^{(q-1)/2} + C^{(q-1)/2} - A^{(q-1)/2} \equiv 0 \ (q). \tag{30}$$

Once again it follows that $q \mid ABC$. However, since $q \mid x$, (28) and (29) imply
that $q \mid BC$ is impossible. Thus $q \mid A$. By (26) and (27) we see that

$$T^p \equiv p y^{p-1} \ (q).$$

By (28), $y \equiv B^p \ (q)$; and since (A, T) = 1, $q \nmid T$. Thus, since p = (q - 1)/2
we have $\pm 1 \equiv p \ (q)$ which is impossible. Thus the proof is complete. □


Unfortunately it is not known whether there are infinitely many “Germain”
primes, i.e., primes p such that 2p + 1 is prime. The interested reader should
consult Lecture IV in the book by Ribenboim [206].
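
The odd primes covered by Proposition 17.4.1 are easy to list for small bounds. The following sketch uses plain trial division; the function names `is_prime` and `germain_primes` are ours and the method is chosen for clarity, not speed.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def germain_primes(limit):
    """Odd primes p <= limit such that 2p + 1 is also prime."""
    return [p for p in range(3, limit + 1, 2)
            if is_prime(p) and is_prime(2 * p + 1)]
```

For instance `germain_primes(100)` gives 3, 5, 11, 23, 29, 41, 53, 83, 89, so the first case of Fermat's equation is settled for each of these exponents.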


§5 Pell’s Equation 

Let d be a positive square-free integer. The Diophantine equation to be
considered is

$$x^2 - dy^2 = 1. \tag{31}$$

That this equation has an infinite number of solutions was conjectured by 
Fermat in 1657 and eventually solved by Lagrange. It seems that Pell had 
nothing to do with it, the error in attaching his name to it being due to Euler. 
For the whole story, and much more, the interested reader should consult the 
book by Edwards [128]. See also Davenport [22], and A. Weil [240]. 

The solution to (31) depends upon the following proposition of Dirichlet
and is an application of the pigeon hole principle.


Proposition 17.5.1. If $\xi$ is irrational then there are infinitely many rational
numbers x/y, (x, y) = 1 such that $|x/y - \xi| < 1/y^2$.


Proof. Partition the half-open interval [0, 1) by

$$[0, 1) = \left[0, \tfrac{1}{n}\right) \cup \left[\tfrac{1}{n}, \tfrac{2}{n}\right) \cup \cdots \cup \left[\tfrac{n-1}{n}, 1\right).$$

If $[\alpha]$ denotes, as usual, the largest integer less than or equal to $\alpha$ then the
fractional part of $\alpha$ is defined by $\alpha - [\alpha]$. It lies in a unique member of the
partition. Consider the fractional parts of $0, \xi, 2\xi, \ldots, n\xi$. At least two of these
must lie in the same subinterval. In other words there exist j, k with j > k,
$0 \leq j, k \leq n$ such that

$$|j\xi - [j\xi] - (k\xi - [k\xi])| < \frac{1}{n}. \tag{32}$$

Put $y = j - k$, $x = [j\xi] - [k\xi]$ so that (32) becomes $|x - y\xi| < 1/n$. Here
we may assume that (x, y) = 1 since division by (x, y) only strengthens the
inequality. But $0 < y \leq n$ implies that $|x/y - \xi| < 1/ny \leq 1/y^2$. To obtain
infinitely many solutions note that $|x/y - \xi| \neq 0$ and choose an integer
$m > 1/|x/y - \xi|$. The above procedure gives the existence of integers $x_1, y_1$
such that $|x_1/y_1 - \xi| < 1/my_1 < |x/y - \xi|$ and $0 < y_1 \leq m$. This procedure
leads to an infinite number of solutions. □
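
The pigeonhole argument is constructive, and a direct transcription is sketched below. Two caveats: the irrational number is represented by a floating-point approximation, so the code is only a numerical illustration, and the function name `dirichlet_approximation` is ours.

```python
from math import floor, gcd, sqrt


def dirichlet_approximation(xi, n):
    """Return coprime (x, y) with 0 < y <= n and |x - y*xi| < 1/n.

    Follows the proof: the n + 1 fractional parts of 0, xi, ..., n*xi fall
    into n subintervals of [0, 1), so two of them share a subinterval.
    """
    seen = {}  # subinterval index -> multiplier k already placed there
    for j in range(n + 1):
        frac = j * xi - floor(j * xi)
        box = min(int(frac * n), n - 1)  # index of [box/n, (box+1)/n)
        if box in seen:
            k = seen[box]
            y = j - k
            x = floor(j * xi) - floor(k * xi)
            g = gcd(x, y)                # dividing out only helps
            return x // g, y // g
        seen[box] = j
    raise AssertionError("unreachable: the pigeonhole principle forces a collision")
```

With $\xi = \sqrt{2}$ and n = 10 the first collision occurs between the multiples 0 and $5\xi$, giving x/y = 7/5, and indeed $|7 - 5\sqrt{2}| < 1/10$.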


This proposition will be applied to show that $|x^2 - dy^2|$ assumes the same
value infinitely often.

Lemma 1. If d is a positive square-free integer then there is a constant M such
that $|x^2 - dy^2| < M$ has infinitely many integral solutions.

Proof. Write $x^2 - dy^2 = (x + \sqrt{d}\,y)(x - \sqrt{d}\,y)$. By Proposition 17.5.1
there exist infinitely many pairs of relatively prime integers (x, y), y > 0
satisfying $|x - \sqrt{d}\,y| < 1/y$. It follows that

$$|x + \sqrt{d}\,y| \leq |x - \sqrt{d}\,y| + 2\sqrt{d}\,|y| < \frac{1}{y} + 2\sqrt{d}\,y.$$

Hence $|x^2 - dy^2| < |1/y + 2\sqrt{d}\,y| \cdot 1/y \leq 2\sqrt{d} + 1$ and the proof is
complete. □

The main result of this section is as follows. 

Proposition 17.5.2. If d is a positive square-free integer then $x^2 - dy^2 = 1$ has
infinitely many integral solutions. Furthermore there is a solution $(x_1, y_1)$ such
that every solution has the form $\pm(x_n, y_n)$ where $x_n + \sqrt{d}\,y_n = (x_1 + \sqrt{d}\,y_1)^n$,
$n \in \mathbb{Z}$.

Proof. By Lemma 1 there is an $m \in \mathbb{Z}$ such that $x^2 - dy^2 = m$ for infinitely
many integral pairs (x, y), x > 0, y > 0. We may assume that the x components
are distinct. Furthermore since there are only finitely many residue
classes modulo |m| one can find $(x_1, y_1)$, $(x_2, y_2)$, $x_1 \neq x_2$ such that
$x_1 \equiv x_2 \ (|m|)$, $y_1 \equiv y_2 \ (|m|)$. Put $\alpha = x_1 - y_1\sqrt{d}$, $\beta = x_2 - y_2\sqrt{d}$. If
$\gamma = x - y\sqrt{d}$ let $\gamma' = x + y\sqrt{d}$ denote the conjugate of $\gamma$ and $N(\gamma) = x^2 - dy^2$
denote the norm of $\gamma$. Recall that $N(\alpha\beta) = N(\alpha)N(\beta)$. A short calculation
shows that $\alpha\beta' = A + B\sqrt{d}$ where $m \mid A$, $m \mid B$. Thus $\alpha\beta' = m(u + v\sqrt{d})$ for
integers u and v. Taking norms of both sides gives $m^2 = m^2(u^2 - v^2 d)$. Thus

$$u^2 - v^2 d = 1. \tag{33}$$

It remains to see that $v \neq 0$. However if v = 0 then $u = \pm 1$ and $\alpha\beta' = \pm m$.
Multiplying by $\beta$ gives $\alpha m = \pm m\beta$ or $\alpha = \pm\beta$. But this implies that $x_1 = x_2$.
Thus Pell’s equation has a solution with $xy \neq 0$.

To prove the second assertion let us say that a solution (x, y) is greater than
a solution (u, v) if $x + y\sqrt{d} > u + v\sqrt{d}$. Now consider the smallest solution
$\alpha$ with x > 0, y > 0. Such a solution clearly exists (why?) and is unique. It is
called the fundamental solution.

Consider any solution $\beta = u + v\sqrt{d}$, u > 0, v > 0. We show that there is a
positive integer n such that $\beta = \alpha^n$. For otherwise choose n > 0 so that
$\alpha^n < \beta < \alpha^{n+1}$. Then since $\alpha' = \alpha^{-1}$, $1 < (\alpha')^n\beta < \alpha$. But if $(\alpha')^n\beta = A + B\sqrt{d}$,
(A, B) is a solution to Pell’s equation and $1 < A + B\sqrt{d} < \alpha$. Now
$A + B\sqrt{d} > 0$ so $A - B\sqrt{d} = (A + B\sqrt{d})^{-1} > 0$. Thus A > 0. Also
$A - B\sqrt{d} = (A + B\sqrt{d})^{-1} < 1$ so $B\sqrt{d} > A - 1 \geq 0$. Thus B > 0. This contradicts
the choice of $\alpha$. If $\beta = a + b\sqrt{d}$ is a solution, a > 0, b < 0, then
$\beta' = a - b\sqrt{d} = \alpha^n$ by the above so $\beta = \alpha^{-n}$. The cases a < 0, b > 0 and a < 0,
b < 0 lead obviously to $-\alpha^n$ for $n \in \mathbb{Z}$. The proof is now complete. □


For a solution to special cases of Pell’s equation using cyclotomy see 
Dirichlet [126] and Hartung [145]. 
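
For small d the fundamental solution can be found by direct search, and the remaining positive solutions are then generated by taking powers, as in Proposition 17.5.2. The sketch below is a naive illustration only: the names `fundamental_solution` and `pell_solutions` are ours, and the search over y can be very slow for unlucky d, since the fundamental solution may be enormous.

```python
from math import isqrt


def fundamental_solution(d):
    """Smallest (x, y) with x, y > 0 and x^2 - d*y^2 = 1, for non-square d > 0."""
    if isqrt(d) ** 2 == d:
        raise ValueError("d must not be a perfect square")
    y = 1
    while True:
        x2 = d * y * y + 1
        x = isqrt(x2)
        if x * x == x2:
            return x, y  # the smallest y also gives the smallest x + y*sqrt(d)
        y += 1


def pell_solutions(d, count):
    """The first `count` positive solutions, from powers of the fundamental one."""
    x1, y1 = fundamental_solution(d)
    sols, (x, y) = [], (x1, y1)
    for _ in range(count):
        sols.append((x, y))
        # (x + y*sqrt(d)) * (x1 + y1*sqrt(d))
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return sols
```

For d = 2 this yields (3, 2), (17, 12), (99, 70), and for d = 13 the search already has to reach y = 180 before finding (649, 180).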


§6 Sums of Two Squares 

If p is prime, $p \equiv 1 \ (4)$ then by Proposition 8.3.1 the Diophantine equation
$x^2 + y^2 = p$ has an integral solution which is essentially unique. There are
many proofs of this result. It will be recalled that the proof in Chapter 8 made
use of the ring of Gaussian integers. By further exploiting the arithmetic of this
ring we will determine the number of representations of an arbitrary positive
integer as the sum of two squares. The result is conveniently stated and in fact
proved using the nontrivial Dirichlet character modulo 4 introduced in
Section 2 of Chapter 16. Recall that this character $\chi$ is defined on $\mathbb{Z}$ by $\chi(d) = 1$
if $d \equiv 1 \ (4)$, $\chi(d) = -1$ if $d \equiv 3 \ (4)$ and $\chi(2k) = 0$.

Proposition 17.6.1. The number of integral solutions (x, y), x > 0, $y \geq 0$ to the
equation $x^2 + y^2 = n$ is $\sum_{d \mid n} \chi(d)$.

In other words the number of representations of n as the sum of two
nonnegative squares the first of which is positive is the excess of the number of
divisors of the form 4k + 1 over the number of divisors of the form 4k + 3.
The total number of solutions (x, y), $x, y \in \mathbb{Z}$ is then easily seen to be $4\sum_{d \mid n} \chi(d)$.

Before proceeding to the proof we derive two corollaries. 

Corollary 1. The equation $x^2 + y^2 = n$, n > 0 has an integral solution iff $\mathrm{ord}_p\, n$
is even for every prime $p \equiv 3 \ (4)$. When that is the case the number of solutions
with x > 0, $y \geq 0$ is $\prod_{p \equiv 1 (4)} (1 + \mathrm{ord}_p\, n)$.

Proof. Since $\chi(n)$ is multiplicative it follows by Exercise 10, Chapter 2 that
$\sum_{d \mid n} \chi(d)$ is multiplicative. If $p \equiv 1 \ (4)$ then $\sum_{d \mid p^n} \chi(d) = n + 1$ while if
$p \equiv 3 \ (4)$ then $\sum_{d \mid p^n} \chi(d)$ is 0 or 1 according as n is odd or even. The result
follows. □
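
Proposition 17.6.1 lends itself to a quick numerical check. The sketch below compares the divisor sum with direct counts; the function names are ours, and everything is done by brute force for small n only.

```python
from math import isqrt


def chi(d):
    """The nontrivial Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[d % 4]


def count_by_divisors(n):
    """The sum of chi(d) over the positive divisors d of n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)


def count_restricted(n):
    """Number of (x, y) with x > 0, y >= 0 and x^2 + y^2 = n."""
    count = 0
    for x in range(1, isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            count += 1
    return count


def count_all(n):
    """Number of (x, y) in Z x Z with x^2 + y^2 = n."""
    r = isqrt(n)
    return sum(1 for x in range(-r, r + 1) for y in range(-r, r + 1)
               if x * x + y * y == n)
```

For n = 25 the divisors 1, 5, 25 all have character value 1, and the three restricted representations are (3, 4), (4, 3) and (5, 0); the total count over $\mathbb{Z} \times \mathbb{Z}$ is 12.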


Corollary 2. Let m be a positive odd integer. The number of integral solutions
(x, y), x > 0, y > 0 to $x^2 + y^2 = 2m$ is $\sum_{d \mid m} \chi(d)$.

Proof. Since $2m \equiv 2 \ (4)$, y is positive. On the other hand $\chi(2d) = 0$ for any
divisor 2d of 2m. □


We now proceed to the proof of the proposition. Consider the ring $\mathbb{Z}[i]$ of
Gaussian integers. By Exercise 33, Chapter 1 the units are $\pm 1$, $\pm i$. Thus each
nonzero $\alpha \in \mathbb{Z}[i]$ has a unique associate $x + iy$, x > 0, $y \geq 0$. If $N(x + iy) = x^2 + y^2$
is the norm mapping then clearly the number of solutions to $x^2 + y^2 = n$,
x > 0, $y \geq 0$ is the number of ideals $(\alpha)$ with $N(\alpha) = n$. Denote this
number by $a_n$. Recall further that every ideal $(\alpha) \neq 0$ may be uniquely written
(up to order) as $(\pi_1)^{t_1} \cdots (\pi_s)^{t_s}$ where $\pi_i$ is irreducible. Finally according to
Section 7 of Chapter 9 the irreducibles are given, up to a unit, by $1 + i$, $\pi$
with $\pi\bar{\pi} = p \equiv 1 \ (4)$, and q, a rational prime, $q \equiv 3 \ (4)$. Also $\pi$ and $\bar{\pi}$ are not
associates.

We now introduce the formal Dirichlet series $\sum_{n=1}^{\infty} a_n/n^s$. This series is
known as the zeta function of the ring $\mathbb{Z}[i]$. We view this expression formally
and shall not need any analytic properties of the associated function of a
complex variable. Using the unique factorization of ideals in $\mathbb{Z}[i]$ proved in
Section 4 of Chapter 1 one sees, using the same argument as in Exercise 25,
Chapter 2, that

$$\sum_{n=1}^{\infty} \frac{a_n}{n^s} = \prod_{(\pi)} \frac{1}{1 - 1/N(\pi)^s}, \tag{34}$$
the product being over the set of (unassociated) irreducibles in $\mathbb{Z}[i]$. The
right-hand side of (34) becomes, by the above classification of irreducibles

$$\frac{1}{1 - 1/2^s} \prod_{p \equiv 1 (4)} \left(\frac{1}{1 - 1/p^s}\right)^2 \prod_{q \equiv 3 (4)} \frac{1}{1 - 1/q^{2s}}. \tag{35}$$


Next recall that

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \frac{1}{1 - 1/p^s}.$$

Noting that $1/(1 - q^{-2s}) = (1/(1 - q^{-s}))(1/(1 + q^{-s}))$ we see by rearrangement
of terms that (35) becomes

$$\zeta(s) \prod_{p \equiv 1 (4)} \frac{1}{1 - 1/p^s} \prod_{q \equiv 3 (4)} \frac{1}{1 + 1/q^s}. \tag{36}$$

This may be written as

$$\zeta(s) \prod_p \frac{1}{1 - \chi(p)/p^s}. \tag{37}$$

x(p)/p 

Finally, using the fact that $\chi$ is multiplicative we see that (37) may be written as

$$\zeta(s) \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \tag{38}$$


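Recall that the second factor in (38) is the Dirichlet L-series introduced in
Chapter 16, Section 2 in order to compute the density of primes $p \equiv 1 \ (4)$.
We have shown

$$\sum_{n=1}^{\infty} \frac{a_n}{n^s} = \left(\sum_{n=1}^{\infty} \frac{1}{n^s}\right)\left(\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}\right). \tag{39}$$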


Proposition 17.6.1 follows immediately from (39) for the coefficient of $1/n^s$ in the
right-hand side of (39) is, by the very definition of Dirichlet multiplication,
$\sum_{d \mid n} \chi(d)$. This completes the proof. □


It should be noted that the rearrangement step in the above proof is 
purely formal and does not require any analytic properties of the infinite 
products. 


§7 Sums of Four Squares 

In 1621 Bachet stated without proof that every positive integer is the sum of
four squares. This assertion was proved in 1770 by Lagrange. In 1834 Jacobi
was able to give a remarkably simple formula for the total number of
representations of an integer as the sum of four squares from which the result of
Lagrange follows immediately.

We begin this section by giving the standard proof of Lagrange’s theorem. 
The technique is that of descent. Having established the result for primes the 
general result follows from a formal identity due to Euler expressing the 
fact that the norm of a quaternion is a multiplicative function. In the last, and 
somewhat lengthier part of this section we prove Jacobi’s theorem. The proof 
is based upon a letter (1856) from Dirichlet to Liouville ([122], pp. 201-208) 
simplifying Jacobi’s proof. See also Weil [237]. 

We begin with a diophantine problem modulo p. 

Lemma 1. If p is prime the congruence $x^2 + y^2 + 1 \equiv 0 \ (p)$ has a solution in
integers x, y.

Proof. If p = 2 take x = 1, y = 0. For odd p denote by S the set of squares
modulo p. Then S and $S' = \{-1 - x \mid x \in S\}$ each have (p + 1)/2 elements.
Thus S and S' are not disjoint and the result follows. □

By the above lemma there is an integer m such that $mp = 1 + x^2 + y^2$ has
an integral solution and furthermore by adjusting the residues one may assume
$|x| < p/2$, $|y| < p/2$. Thus $mp < 1 + p^2/4 + p^2/4$ so that m < p.

Lemma 2. Suppose for a prime p there is an integer m, 1 < m < p such that mp
is the sum of four squares. Then there is an n, 0 < n < m such that np is the sum
of four squares.

Proof. Write

$$mp = x_1^2 + x_2^2 + x_3^2 + x_4^2. \tag{40}$$

Let $x_i \equiv y_i \ (m)$ with $-m/2 < y_i \leq m/2$. Then $y_1^2 + y_2^2 + y_3^2 + y_4^2 \equiv 0 \ (m)$ so
that there is an integer $r \geq 0$ such that

$$rm = y_1^2 + y_2^2 + y_3^2 + y_4^2. \tag{41}$$

Now $rm \leq m^2/4 + m^2/4 + m^2/4 + m^2/4 = m^2$ so that $r \leq m$. First of all
$r \neq 0$ for otherwise $y_i = 0$, $i = 1, \ldots, 4$ which would imply by (40) that $m \mid p$,
a contradiction. Also $r \neq m$, since otherwise $y_i = m/2$; then $x_i^2 \equiv m^2/4 \ (m^2)$
and (40) implies that $mp \equiv m^2 \ (m^2)$ or $m \mid p$. Multiplying (40) and (41) gives,
by Exercise 28, the identity

$$m^2 rp = (x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4)^2 + (x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3)^2
+ (x_1 y_3 - x_3 y_1 - x_2 y_4 + x_4 y_2)^2 + (x_1 y_4 - x_4 y_1 + x_2 y_3 - x_3 y_2)^2. \tag{42}$$

Using $x_i \equiv y_i \ (m)$ one sees that each term on the right-hand side of (42) is
divisible by $m^2$. Cancelling $m^2$ shows that rp is the sum of four squares and the
proof is complete. □

Proposition 17.7.1. Any positive integer is the sum of four squares.

Proof. This follows immediately from Lemmas 1 and 2 and Exercise 28. □
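
The descent in Lemma 2 is effective, and for an odd prime p it can be run as written. The following sketch is a minimal illustration, with function names of our own choosing; the starting point $x^2 + y^2 + 1 \equiv 0 \ (p)$ is found by plain search over residues below p/2, which is fine for small p but not efficient.

```python
def euler_product(xs, ys):
    """The four bilinear forms on the right-hand side of Euler's identity (42)."""
    x1, x2, x3, x4 = xs
    y1, y2, y3, y4 = ys
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x3 * y1 - x2 * y4 + x4 * y2,
        x1 * y4 - x4 * y1 + x2 * y3 - x3 * y2,
    )


def centered(x, m):
    """The residue y of x modulo m with -m/2 < y <= m/2."""
    y = x % m
    return y - m if 2 * y > m else y


def descent_step(p, xs):
    """Given m*p = sum of squares of xs with 1 < m < p, return four squares summing to r*p, r < m."""
    m = sum(x * x for x in xs) // p
    ys = tuple(centered(x, m) for x in xs)
    return tuple(v // m for v in euler_product(xs, ys))  # each entry is divisible by m


def four_squares_prime(p):
    """Write an odd prime p as a sum of four squares by repeated descent."""
    half = (p - 1) // 2
    xs = next((x, y, 1, 0)
              for x in range(half + 1) for y in range(half + 1)
              if (x * x + y * y + 1) % p == 0)
    while sum(x * x for x in xs) != p:
        xs = descent_step(p, xs)
    return xs
```

For p = 7 the search starts from 2² + 3² + 1² + 0² = 14, and one descent step produces (2, 1, 1, 1), whose squares sum to 7.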

Let us now turn to the statement and proof of Jacobi’s theorem. The 
result that we will establish is the following. 

Proposition 17.7.2. Let n be a positive integer, $n \equiv 4 \ (8)$. The number of integral
solutions (x, y, z, w), x, y, z, w positive and odd, to the equation

$$x^2 + y^2 + z^2 + w^2 = n \tag{43}$$

is the sum of the positive odd divisors of n.

We leave to the Exercises the following corollary. 

Corollary. Let n be a positive integer. The number of integral solutions (x, y,
z, w) to $x^2 + y^2 + z^2 + w^2 = n$ is $8\sum_{d \mid n} d$ if n is odd and $24\sum_{d \mid n,\ d\ \mathrm{odd}} d$ if n is
even.
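
Both Proposition 17.7.2 and its corollary can be checked by brute force for small n. The sketch below does just that; the function names are ours and the counting is deliberately naive.

```python
from math import isqrt


def odd_divisor_sum(n):
    """Sum of the positive odd divisors of n."""
    return sum(d for d in range(1, n + 1, 2) if n % d == 0)


def jacobi_count(n):
    """The count predicted by the corollary: 8*sigma(n) for odd n, 24*(odd divisor sum) for even n."""
    if n % 2 == 1:
        return 8 * sum(d for d in range(1, n + 1) if n % d == 0)
    return 24 * odd_divisor_sum(n)


def count_four_squares(n):
    """Number of (x, y, z, w) in Z^4 with x^2 + y^2 + z^2 + w^2 = n."""
    r = isqrt(n)
    count = 0
    for x in range(-r, r + 1):
        for y in range(-r, r + 1):
            for z in range(-r, r + 1):
                w2 = n - x * x - y * y - z * z
                if w2 < 0:
                    continue
                w = isqrt(w2)
                if w * w == w2:
                    count += 1 if w == 0 else 2  # w and -w
    return count


def count_positive_odd(n):
    """Number of solutions of (43) with x, y, z, w positive and odd."""
    odd = range(1, isqrt(n) + 1, 2)
    return sum(1 for x in odd for y in odd for z in odd for w in odd
               if x * x + y * y + z * z + w * w == n)
```

For n = 12 the positive odd solutions are the four arrangements of (3, 1, 1, 1), in agreement with 1 + 3 = 4.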

The proof of the proposition is divided into several lemmas. Let N denote
the number of integral solutions (x, y, z, w) to (43) with x, y, z, w positive and
odd. Since $n \equiv 4 \ (8)$ we may write n = 2m, $m \equiv 2 \ (4)$.

Lemma 3. N is the number of solutions (x, y, z, w, u, v) to the system of Diophantine
equations

$$x^2 + y^2 = 2u, \qquad z^2 + w^2 = 2v, \qquad u + v = m, \tag{44}$$

with x, y, z, w, u, v odd and positive.

Proof. This is left as a simple exercise. □

As in Section 6 let $\chi$ denote the nontrivial Dirichlet character modulo 4.

Lemma 4. $N = \sum \chi(de) = \sum (-1)^{(de-1)/2} = \sum (-1)^{(d-e)/2}$, the sum over all
solutions (d, e, t, s) in positive odd integers to $ds + et = m$.

Proof. By Lemma 3 and Corollary 2 of Proposition 17.6.1 we see easily that

$$N = \sum_{\substack{u, v \\ u + v = m}} \Bigl( \sum_{\substack{d \mid u \\ e \mid v}} \chi(d)\chi(e) \Bigr). \tag{45}$$

Write u = ds, v = et so that the terms in (45) are in one-to-one correspondence
with solutions (d, e, t, s), d, e, t, s positive, odd and satisfying $ds + et = m$.
This proves the first equality in the lemma. The second follows from the
definition of $\chi$ and the fact that $(d - 1)/2 + (e - 1)/2 \equiv (de - 1)/2 \ (2)$ when
d and e are odd. □

Consider now the terms in $\sum \chi(de)$, the sum being as in Lemma 4, for which
d = e. For each odd $d \mid m$, $s + t = m/d$ has m/2d solutions in positive odd s, t.
The total number of solutions is therefore $\sum_{d \mid m,\ d\ \mathrm{odd}} m/2d = \sum_{d \mid m,\ d\ \mathrm{odd}} d$. Each solution
of $ds + et = m$, d = e contributes $\chi(d^2) = 1$ to N by Lemma 4. The proof of
Proposition 17.7.2 will follow if one shows $\sum \chi(de) = 0$, the sum as in Lemma
4 and $d \neq e$. Pairing (d, e, t, s) with (e, d, s, t) shows that it is enough to prove
$\sum \chi(de) = 0$, d > e.

Denote by S the set of all (d, e, t, s), d > e, $ds + et = m$, d, e, t, s positive
and odd. The idea behind the remainder of the proof is to construct a bijection
of S that sends $\sum_S \chi(de)$ to its negative. This, of course, will imply that
$\sum_S \chi(de) = 0$.

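For a nonnegative integer n put

$$L_n = \begin{pmatrix} n + 1 & n + 2 \\ -n & -n - 1 \end{pmatrix}$$

and define $(d', e', t', s')$ by

$$\begin{pmatrix} d' & t' \\ -e' & s' \end{pmatrix} = L_n \begin{pmatrix} t & d \\ s & -e \end{pmatrix}, \tag{46}$$

that is, $d' = (n + 2)s + (n + 1)t$, $e' = (n + 1)s + nt$, $t' = (n + 1)d - (n + 2)e$,
$s' = (n + 1)e - nd$. Since

$$L_n^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

one checks quickly that

$$\begin{pmatrix} t & d \\ s & -e \end{pmatrix} = L_n \begin{pmatrix} d' & t' \\ -e' & s' \end{pmatrix}.$$

Taking determinants in (46), and noting that $\det L_n = -1$, one sees that

$$ds + et = d's' + e't'. \tag{47}$$

Thus for each n we have a mapping from $\mathbb{Z}^4$ to $\mathbb{Z}^4$, which we denote by $\psi_n$.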


Lemma 5. Given $(d, e, t, s) \in S$ there is a unique integer $n \geq 0$ such that
$\psi_n(d, e, t, s) \in S$.

Proof. One sees immediately using (46) that d', e', t', s' are odd, d' > e', d' > 0,
e' > 0. Furthermore the conditions s' > 0, t' > 0 are equivalent to, by (46),
$e/(d - e) - 1 < n < e/(d - e)$. But d - e is positive and even and e is odd
from which it follows that this inequality is satisfied for a unique $n \geq 0$. This
concludes the proof. □





Denote the mapping from $S$ to $S$ defined by Lemma 5 by $\Phi$.

Lemma 6. $\Phi$ is a bijection.

Proof. We will show that $\Phi^2$ is the identity map. For if $(d, e, t, s) \in S$ then

$$\Phi^2(d, e, t, s) = \psi_k(\psi_n(d, e, t, s)), \tag{48}$$

where $k$ and $n$ are defined by Lemma 5. But the integer $k$ is uniquely defined
by the condition that the right-hand side of (48) is in $S$, and that is true if
$k = n$. Thus $\Phi^2(d, e, t, s) = (d, e, t, s)$ and the proof of the lemma is complete. □


In order to complete the proof observe from (46) that $d' - e' = s + t$. But
$\chi(de) = (-1)^{(d-e)/2}$. Since $ds + et \equiv 2\ (4)$ one sees that $(d - e)/2$ is even
iff $(s + t)/2$ is odd. Thus $\chi(de) = -\chi(d'e')$. Finally $M = \sum_S \chi(de) =
-\sum_S \chi(d'e') = -M$, from which it follows that $M = 0$ and the proof is
complete. □
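
The cancellation just proved can be checked by brute force for small $m$. The sketch below is only an illustration, not part of the proof: it enumerates all positive odd $(d, e, t, s)$ with $ds + et = m$, assuming $m \equiv 2\ (4)$ as the congruence $ds + et \equiv 2\ (4)$ requires, and compares the full character sum with the number of solutions having $d = e$. The function names are ours.

```python
def chi(n):
    """Nontrivial Dirichlet character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def factor_pairs(n):
    """All (a, b) with a * b == n and a, b positive."""
    return [(a, n // a) for a in range(1, n + 1) if n % a == 0]


def solutions(m):
    """All (d, e, t, s) in positive odd integers with d*s + e*t == m."""
    out = []
    for u in range(1, m, 2):          # u = d*s is odd
        v = m - u                     # v = e*t must be odd as well
        if v % 2 == 0:
            continue
        for d, s in factor_pairs(u):
            for e, t in factor_pairs(v):
                out.append((d, e, t, s))
    return out


def character_sums(m):
    """Return (sum over all solutions, sum over d > e, number with d == e)."""
    sols = solutions(m)
    total = sum(chi(d * e) for d, e, t, s in sols)
    upper = sum(chi(d * e) for d, e, t, s in sols if d > e)
    diagonal = sum(1 for d, e, t, s in sols if d == e)
    return total, upper, diagonal
```

For $m = 6$ the call `character_sums(6)` returns `(4, 0, 4)`: the terms with $d > e$ cancel and only the four solutions with $d = e$ contribute.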


§8 The Fermat Equation: Exponent 3

The Fermat equation

$$x^p + y^p = z^p \tag{49}$$

has been discussed in special cases in Sections 2 and 4 and in Chapter 14
(Theorem 5). In this section, using the arithmetic of $\mathbb{Z}[\omega]$ where $\omega^3 = 1$,
$\omega \ne 1$, we give a complete solution to the equation

$$x^3 + y^3 = z^3. \tag{50}$$

That this equation has no integral solution, $xyz \ne 0$, was first proved
essentially by Euler. See, however, G. Bergmann [91].

Instead of (50) we shall study the more general equation

$$x^3 + y^3 = uz^3, \tag{51}$$

where $u$ is a fixed unit in $\mathbb{Z}[\omega]$, and prove the following result.

Proposition 17.8.1. The equation $x^3 + y^3 = uz^3$, where $u$ is a fixed unit in $\mathbb{Z}[\omega]$,
has no integral solution $(x, y, z)$, $xyz \ne 0$, where $x, y, z \in \mathbb{Z}[\omega]$.

This implies, of course, that a nonzero cube in $\mathbb{Z}$ is not the sum of two
nonzero cubes in $\mathbb{Z}$.

Proposition 17.8.1 will be proved in a sequence of lemmas. First we recall
the basic facts concerning the arithmetic of $\mathbb{Z}[\omega]$, proved in Chapter 9. The
ring $\mathbb{Z}[\omega]$ is a principal ideal ring with units $\pm 1, \pm\omega, \pm\omega^2$. Write $\lambda = 1 - \omega$
and recall that $(\lambda)^2 = (3)$, and that $\lambda$ is irreducible. Each element $\alpha \in \mathbb{Z}[\omega]$ is
congruent modulo $\lambda$ to $+1$, $-1$ or $0$. This fact will be used repeatedly in the
following. If $\alpha = u\lambda^n\beta$ where $u$ is a unit and $\lambda \nmid \beta$ then we write $n = \mathrm{ord}_\lambda\,\alpha$.

First of all we establish the weaker result, the so-called first case, that (51)
has no solution with $\lambda \nmid xyz$.

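Lemma 1. The equation $x^3 + y^3 = uz^3$, $u$ a unit in $\mathbb{Z}[\omega]$, has no solution with
$x, y, z \in \mathbb{Z}[\omega]$, $\lambda \nmid xyz$.

Proof. Note that since $\lambda$ is irreducible the condition $\lambda \nmid xyz$ is equivalent to
$\lambda \nmid x$, $\lambda \nmid y$, $\lambda \nmid z$. If $x \in \mathbb{Z}[\omega]$, $x \equiv 1\ (\lambda)$ then $x^3 \equiv 1\ (\lambda^4)$. For if $x = 1 + \lambda t$
then

$$\begin{aligned}
x^3 - 1 &= (x - 1)(x - \omega)(x - \omega^2) \\
&= \lambda t(1 - \omega + \lambda t)(1 - \omega^2 + \lambda t) \\
&= \lambda t(\lambda + \lambda t)((1 + \omega)\lambda + \lambda t) \\
&= \lambda^3 t(1 + t)(t - \omega^2).
\end{aligned}$$

Since $\omega^2 \equiv 1\ (\lambda)$ and $t$ is congruent modulo $\lambda$ to $+1$, $-1$ or $0$ the congruence
follows.

Now assume a solution to (51) exists with $\lambda \nmid xyz$ and reduce modulo $\lambda^4$.
Then

$$\pm 1 \pm 1 \equiv \pm u\ (\lambda^4). \tag{52}$$

But it is easy to check that (52) is impossible for any choice of signs and unit.
This completes the proof. □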
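
Both facts used in this proof can be confirmed by a finite computation. The sketch below is an illustration under our own conventions: an element $a + b\omega$ is stored as the pair `(a, b)`, the relation $\omega^2 = -1 - \omega$ is used for multiplication, and, since $(\lambda)^2 = (3)$, congruences modulo $\lambda^4$ are treated as congruences modulo 9. It also uses the fact that $\lambda \mid a + b\omega$ exactly when $3 \mid a + b$.

```python
def mul(x, y):
    """Multiply a + b*w and c + d*w in Z[w], where w*w = -1 - w."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)


def cube_mod9(x):
    """Coordinates of x^3 reduced modulo 9."""
    c0, c1 = mul(mul(x, x), x)
    return (c0 % 9, c1 % 9)


def cubes_are_plus_minus_one():
    """True if x^3 is congruent to +1 or -1 mod 9 whenever lambda does not divide x."""
    for a in range(9):
        for b in range(9):
            if (a + b) % 3 == 0:      # lambda divides a + b*w
                continue
            if cube_mod9((a, b)) not in {(1, 0), (8, 0)}:
                return False
    return True


# The six units +-1, +-w, +-w^2, with w^2 written as -1 - w.
UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1), (-1, -1), (1, 1)]


def congruence_52_solvable():
    """True if some choice of signs and unit satisfies (52) modulo 9."""
    for lhs in (2, 0, -2):            # the possible values of +-1 +- 1
        for a, b in UNITS:
            if (lhs - a) % 9 == 0 and b % 9 == 0:
                return True
    return False
```

Running `cubes_are_plus_minus_one()` gives `True` and `congruence_52_solvable()` gives `False`, in agreement with the lemma.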

We pass now to the more difficult situation in which we assume a solution
exists with $\lambda \mid z$ and $(x, y) = 1$. Thus $\lambda \nmid xy$. Under these conditions the
following lemma shows that in fact $\lambda^2 \mid z$.

Lemma 2. If $x^3 + y^3 = uz^3$ for $x, y, z \in \mathbb{Z}[\omega]$, $\lambda \nmid xy$, $\lambda \mid z$ then $\lambda^2 \mid z$.

Proof. Reduction of (51) modulo $\lambda^4$ gives

$$\pm 1 \pm 1 \equiv uz^3\ (\lambda^4).$$

If $0 \equiv uz^3\ (\lambda^4)$ then $3\,\mathrm{ord}_\lambda z \ge 4$ so that $\mathrm{ord}_\lambda z \ge 2$. If $\pm 2 \equiv uz^3\ (\lambda^4)$ then
$\lambda \mid 2$ which is not true. □

The following lemma constitutes the “descent” step. 

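Lemma 3. If $x^3 + y^3 = uz^3$, $(x, y) = 1$, $\mathrm{ord}_\lambda z \ge 2$ then there exist
$u_1, x_1, y_1, z_1 \in \mathbb{Z}[\omega]$, $u_1$ a unit, $\lambda \nmid x_1y_1$, $\mathrm{ord}_\lambda z_1 = \mathrm{ord}_\lambda z - 1$ and such that

$$x_1^3 + y_1^3 = u_1 z_1^3.$$

Proof. Recall that if $\mathrm{ord}_\lambda \alpha \ne \mathrm{ord}_\lambda \beta$ then $\mathrm{ord}_\lambda(\alpha \pm \beta) = \min(\mathrm{ord}_\lambda \alpha,
\mathrm{ord}_\lambda \beta)$. Next

$$(x + y)(x + \omega y)(x + \omega^2 y) = uz^3. \tag{53}$$

Since $\mathrm{ord}_\lambda(uz^3) \ge 6$ at least one factor on the left-hand side of (53) is divisible
by $\lambda^2$. Replacing if necessary $y$ by $\omega y$ or $\omega^2 y$ we may assume that $\mathrm{ord}_\lambda(x + y)
\ge 2$. Since $\mathrm{ord}_\lambda (1 - \omega)y = \mathrm{ord}_\lambda \lambda = 1$ we see that

$$\mathrm{ord}_\lambda(x + \omega y) = \mathrm{ord}_\lambda(x + y - (1 - \omega)y) = 1.$$

Similarly $\mathrm{ord}_\lambda(x + \omega^2 y) = 1$. Thus

$$\mathrm{ord}_\lambda(x + y) = 3\,\mathrm{ord}_\lambda z - 2.$$

If $\pi$ is an irreducible, $(\pi) \ne (\lambda)$, then $\pi$ cannot divide $x + y$ and $x + \omega y$. For
otherwise $\pi \mid (1 - \omega)y = \lambda y$, so that $\pi \mid y$ and $\pi \mid x$. It follows that
$(x + y, x + \omega y) = (\lambda)$. Similarly the other pairs of factors of (53) have greatest
common divisor $\lambda$. Since unique factorization in $\mathbb{Z}[\omega]$ holds one can write

$$\begin{aligned}
x + y &= u_1\alpha^3\lambda^t, \qquad t = 3\,\mathrm{ord}_\lambda z - 2, \quad \lambda \nmid \alpha, \\
x + \omega y &= u_2\beta^3\lambda, \qquad \lambda \nmid \beta, \\
x + \omega^2 y &= u_3\gamma^3\lambda, \qquad \lambda \nmid \gamma.
\end{aligned} \tag{54}$$

In (54) $u_1, u_2, u_3$ are units and $(\alpha, \beta) = (\alpha, \gamma) = (\beta, \gamma) = 1$. Multiplying the
second equation in (54) by $\omega$, the third by $\omega^2$ and adding one obtains

$$0 = u_1\alpha^3\lambda^t + \omega u_2\beta^3\lambda + \omega^2 u_3\gamma^3\lambda. \tag{55}$$

Cancelling $\lambda$ (!!) gives

$$0 = u_1\alpha^3\lambda^{3(\mathrm{ord}_\lambda z - 1)} + \omega u_2\beta^3 + \omega^2 u_3\gamma^3. \tag{56}$$

Finally putting $\alpha\lambda^{\mathrm{ord}_\lambda z - 1} = z_1$, $\beta = x_1$, $\gamma = y_1$, (56) becomes, with units
$\varepsilon_1, \varepsilon_2$,

$$x_1^3 + \varepsilon_1 y_1^3 = \varepsilon_2 z_1^3. \tag{57}$$

Reducing (57) modulo $\lambda^2$ and noting that $\mathrm{ord}_\lambda(z_1^3) > 2$ we find

$$\pm 1 \pm \varepsilon_1 \equiv 0\ (\lambda^2). \tag{58}$$

An examination of cases leads immediately to $\varepsilon_1 = \pm 1$. Thus, replacing if
necessary $y_1$ by $-y_1$, we arrive at a new relation

$$x_1^3 + y_1^3 = \varepsilon z_1^3,$$

with $\lambda \nmid x_1y_1$, $\mathrm{ord}_\lambda z_1 = \mathrm{ord}_\lambda z - 1$, $\varepsilon$ a unit. This completes the proof. □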

To prove Proposition 17.8.1 we proceed as follows. If $\lambda \nmid xyz$ we invoke
Lemma 1. If $\lambda \nmid xy$ but $\lambda \mid z$ then Lemmas 2 and 3 lead to a contradiction:
repeated application of Lemma 3 lowers $\mathrm{ord}_\lambda z$ until it equals 1, which
Lemma 2 forbids. Finally, if $\lambda \mid x$ but $\lambda \nmid yz$, then $\pm 1 \equiv u\ (\lambda^3)$ which implies
$\pm 1 = u$. But then $(\pm z)^3 + (-y)^3 = x^3$ and we are in a situation already disposed of.

§9 Cubic Curves with Infinitely Many Rational
Points

In the previous section it was shown that the equation $x^3 + y^3 = z^3$ has no
solution in integers $x, y, z$ with $xyz \ne 0$. Division by $z^3$ shows that the cubic
curve $x^3 + y^3 = 1$ has no rational points $(x, y)$, $xy \ne 0$. Similarly from the
fact established in Section 2 that $x^4 + y^4 = z^2$ has no integral solution with
$xyz \ne 0$ one concludes that the curve defined by $y^2 = x^4 + 1$ has $(0, \pm 1)$ as
its only rational points (see Exercise 31).

In this section we give examples of cubic curves with an infinite number of
rational points. The proof is based upon the simple observation that the
tangent line to a cubic curve at a rational point intersects the curve in a unique,
not necessarily new, point which is again rational. We say that an integer $a$
is cube-free if $\mathrm{ord}_p\,a \le 2$ for all primes $p$, that is, no cube $\ne 1, -1$ divides $a$.


Proposition 17.9.1. If $a > 2$ is a cube-free integer such that the cubic curve with
equation

$$x^3 + y^3 = a \tag{59}$$

has a rational point then it has infinitely many rational points.


Proof. Let $(\alpha, \beta)$ be a rational point on (59). If $\alpha = x_1/z_1$, $\beta = y_1/z_1'$, $(x_1, z_1)
= (y_1, z_1') = 1$ with $x_1, y_1, z_1, z_1'$ integers then it is easy to see that $z_1 = z_1'$.
Since $a > 2$ is cube-free $x_1y_1 \ne 0$ and $x_1 \ne y_1$. The tangent line to (59) at
$(\alpha, \beta)$ is $\alpha^2 x + \beta^2 y = a$. Solving for $y$ and substituting in (59) gives

$$x^3 + \left(\frac{a - \alpha^2 x}{\beta^2}\right)^3 - a = 0. \tag{60}$$

The left-hand side of (60) is a cubic polynomial with $\alpha$ as a double root (at
least). If the third root is $\gamma$ then since the sum of the roots is the negative of
the coefficient of $x^2$ divided by the leading coefficient, we obtain after a
simple calculation,

$$\gamma = \frac{\alpha(\alpha^3 + 2\beta^3)}{\alpha^3 - \beta^3}. \tag{61}$$

Thus

$$\gamma = \frac{x_1(x_1^3 + 2y_1^3)}{z_1(x_1^3 - y_1^3)}. \tag{62}$$

The corresponding value for $y = (a - \alpha^2 x)/\beta^2$ is

$$\rho = \frac{-y_1(2x_1^3 + y_1^3)}{z_1(x_1^3 - y_1^3)} \tag{63}$$

and by (60) $(\gamma, \rho)$ is a rational point on the cubic. The reader may verify
directly, of course, that $(\gamma, \rho)$ satisfies $\gamma^3 + \rho^3 = a$. It remains to show that
$(\gamma, \rho)$ is distinct from $(\alpha, \beta)$ and moreover that one obtains by this process an
infinite number of points on the curve. Define the integer $\Delta$ by $\Delta > 0$ and

$$\begin{aligned}
\Delta x_2 &= x_1(x_1^3 + 2y_1^3), \\
\Delta y_2 &= -y_1(2x_1^3 + y_1^3), \\
\Delta z_2 &= z_1(x_1^3 - y_1^3),
\end{aligned} \tag{64}$$

with $(x_2, y_2, z_2) = 1$. Thus $\Delta$ is the greatest common divisor of the integers on
the right-hand side of (64). Clearly one has

$$x_2^3 + y_2^3 = az_2^3, \qquad z_2 \ne 0. \tag{65}$$

Since $a$ is cube-free and $(x_2, y_2, z_2) = 1$ we see that $(x_2, y_2) = (x_2, z_2) =
(y_2, z_2) = 1$. We claim that $\Delta$ is equal to 1 or 3. For if $p$ is prime and $p \mid \Delta$ then
it follows without difficulty from (64) that $p \nmid x_1y_1z_1$. Thus $p$ divides each of
the second factors on the right-hand side of (64) and consequently $p \mid 3y_1^3$.
Thus $p = 3$, and $\Delta$ is 1 or 3. Notice, also, that $(\Delta, z_1) = 1$ implies $\Delta \mid x_1^3 - y_1^3$.

The proof will be completed by showing that $|z_2| > |z_1|$. To this end one
has

$$|z_2| = \frac{1}{\Delta}|z_1|\,|x_1^3 - y_1^3| = \frac{|z_1|}{\Delta}\,|x_1 - y_1|\,|x_1^2 + x_1y_1 + y_1^2|. \tag{66}$$

One sees, $4|x_1^2 + x_1y_1 + y_1^2| = |(2x_1 + y_1)^2 + 3y_1^2| > 4$ and consequently
one has the inequality $|z_2| > |z_1|\,|x_1 - y_1|/\Delta$. If $\Delta = 1$ then (66) shows
that $|z_2| > |z_1|$. On the other hand, if $\Delta = 3$, then since $\Delta \mid x_1^3 - y_1^3$ one has
$x_1^3 \equiv y_1^3\ (3)$ which implies that $x_1 \equiv y_1\ (3)$ and once again (66) implies
that $|z_2| > |z_1|$. Continuing in this manner one obtains a succession of points
$(x_n/z_n, y_n/z_n)$, $x_ny_n \ne 0$, $(x_n, z_n) = (y_n, z_n) = 1$ and $|z_n| > |z_{n-1}|$, and the
proof is complete. □
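
The tangent step of the proof is easy to carry out with exact rational arithmetic. The following sketch is illustrative only: it applies formulas (61) and (63) repeatedly, starting from the point $(2, 1)$ on $x^3 + y^3 = 9$, which is our choice of example. The function names are ours.

```python
from fractions import Fraction


def tangent_step(alpha, beta):
    """Third intersection of the tangent at (alpha, beta) with x^3 + y^3 = a."""
    denom = alpha**3 - beta**3
    gamma = alpha * (alpha**3 + 2 * beta**3) / denom
    rho = -beta * (2 * alpha**3 + beta**3) / denom
    return gamma, rho


def orbit(alpha, beta, steps):
    """The starting point followed by `steps` successive tangent points."""
    points = [(Fraction(alpha), Fraction(beta))]
    for _ in range(steps):
        points.append(tangent_step(*points[-1]))
    return points
```

Starting from $(2, 1)$, the first tangent point is $(20/7, -17/7)$, and indeed $20^3 - 17^3 = 9 \cdot 7^3$. The denominators of successive points grow strictly, as the proof asserts.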


§10 The Equation $y^2 = x^3 + k$

The Diophantine equation

$$y^2 = x^3 + k \tag{67}$$

has been studied extensively since its consideration in the seventeenth century 
by Fermat and Bachet in the special case k = —2. The integral values of k for 
which (67) has a rational solution have not been determined thus far. It was 
asserted, though not demonstrated, by Bachet and others that given a rational 





solution $(x, y)$, $xy \ne 0$, the tangent method, used in Section 9, produces an
infinite number of solutions. Thus, in modern language, the elliptic curve (67)
then has positive rank (see Chapter 18). This result was established with
several exceptional cases by Fueter in 1930.

In 1966 Mordell gave a remarkably short proof of Fueter’s result [191]. 
More precisely he proved 

Proposition 17.10.1. If $y^2 = x^3 + k$, $k$ a sixth power-free integer, has a rational
solution $(x, y)$, $xy \ne 0$, then there are an infinite number of rational solutions
provided $k \ne 1, -432$.


It is shown in the Exercises that the case $k = -432$ is equivalent to
Fermat's equation $x^3 + y^3 = 1$, which by the main result of Section 8 can
easily be shown to have only the rational solutions $(1, 0)$, $(0, 1)$. We will not
give the details to Proposition 17.10.1, but rather refer the interested reader
to Mordell's paper. The proof consists in showing that the tangent method
used in the preceding section leads to an infinite number of solutions.

Thus $y^2 = x^3 - 2$ has an infinite number of rational points since it has one,
namely $(3, 5)$. However, we point out that there are only a finite number of
integral solutions. This is a difficult theorem for general $k$ but in the case
$k = -2$ a very short proof can be given using Exercise 36 of Chapter 1. For

$$(y + \sqrt{-2})(y - \sqrt{-2}) = x^3. \tag{68}$$

If $\pi$ is an irreducible in $\mathbb{Z}[\sqrt{-2}]$ dividing both factors on the left-hand side of
(68) then $\pi \mid 2\sqrt{-2}$. Thus $(\pi) = (\sqrt{-2})$, and $\sqrt{-2} \mid x$ which implies, taking
norms, that $2 \mid x$. But this implies that $y^2 \equiv 2\ (4)$ which is impossible. Since
$\mathbb{Z}[\sqrt{-2}]$ is a unique factorization ring with units $\pm 1$, (68) shows that

$$y + \sqrt{-2} = (a + b\sqrt{-2})^3.$$

Thus

$$y = a^3 - 6ab^2, \tag{69}$$

$$1 = 3a^2b - 2b^3 = b(3a^2 - 2b^2). \tag{70}$$

Hence $b = 1$ and one obtains as the only solutions $(3, \pm 5)$.
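
A direct search is consistent with this conclusion. The sketch below is only a finite check, not a proof: it lists the integer points on $y^2 = x^3 + k$ with $|x|$ up to a bound of our choosing. The function name and the bound are ours.

```python
from math import isqrt


def integral_points(k, x_max):
    """Integer points (x, y) on y^2 = x^3 + k with |x| <= x_max."""
    points = []
    for x in range(-x_max, x_max + 1):
        rhs = x**3 + k
        if rhs < 0:
            continue                  # no real y
        y = isqrt(rhs)
        if y * y == rhs:
            points.append((x, y))
            if y != 0:
                points.append((x, -y))
    return points
```

For $k = -2$ and $|x| \le 10000$ the search returns only $(3, 5)$ and $(3, -5)$.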

If $d$ is a positive square-free integer then one can find the integer solutions
to $y^2 = x^3 - d$ in certain cases using the arithmetic of the imaginary quadratic
field $\mathbb{Q}(\sqrt{-d})$. As in the case of Fermat's Last Theorem (see Section 11)
it is necessary in this approach to impose a divisibility condition on the class
number $h$ of $\mathbb{Q}(\sqrt{-d})$, namely we require that $3 \nmid h$. If, furthermore, we restrict
$d$ by assuming $d \ne 1, 3$ and $-d \equiv 2$ or $3\ (4)$ then by Chapter 13 the ring
of integers of $\mathbb{Q}(\sqrt{-d})$ is $\mathbb{Z}[\sqrt{-d}]$ and $\pm 1$ are the only units. Under these
conditions assume that $(x, y)$ is an integral solution to $y^2 = x^3 - d$. Then
(Exercise 32) $x$ is odd and $(x, d) = 1$. Now

$$x^3 = (y + \sqrt{-d})(y - \sqrt{-d}).$$

If $P \subset \mathbb{Z}[\sqrt{-d}]$ is a prime ideal containing $y + \sqrt{-d}$ and $y - \sqrt{-d}$ then
$2\sqrt{-d} \in P$ and $x \in P$. Thus $N(P) \mid 4d$ and $N(P) \mid x^2$ which is impossible. It
follows that $(y + \sqrt{-d})$ and $(y - \sqrt{-d})$ have no common ideal factors. Since
$\mathbb{Z}[\sqrt{-d}]$ is a Dedekind ring we have

$$(y + \sqrt{-d}) = \mathfrak{A}^3$$

for some ideal $\mathfrak{A}$. Since $3 \nmid h$ the ideal class group of $\mathbb{Z}[\sqrt{-d}]$ has no element
of order 3 and therefore $\mathfrak{A}$ is principal. Thus, since $\pm 1$ are the only units one
has

$$y + \sqrt{-d} = \pm(a + b\sqrt{-d})^3. \tag{71}$$

This implies

$$\begin{aligned}
1 &= \pm b(3a^2 - db^2), \\
y &= \pm a(a^2 - 3db^2),
\end{aligned} \tag{72}$$

from which one derives easily $b = \pm 1$ and

$$d = 3a^2 \pm 1. \tag{73}$$

Thus $y^2 = x^3 - d$ has a solution precisely when $d$ lies in one of the quadratic
progressions $3a^2 \pm 1$. When this is so one finds easily the value of $x$ to be
$a^2 + d$. Thus we have the following proposition.

Proposition 17.10.2. Let $d > 1$, square-free and $d \equiv 2$ or $1\ (4)$. Assume that the
class number of $\mathbb{Q}(\sqrt{-d})$ is not divisible by 3. Then $y^2 = x^3 - d$ has an integral
solution iff $d$ is of the form $3t^2 \pm 1$. The solutions are then $(t^2 + d, \pm t(t^2 - 3d))$.
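
The solution formula in the proposition can be sanity-checked numerically. The sketch below only verifies the algebraic identity $y^2 = x^3 - d$ for $d = 3t^2 \pm 1$; it does not test the class number hypothesis or square-freeness, and the function name is ours.

```python
def predicted_solution(t, sign):
    """For d = 3*t*t + sign (sign is +1 or -1) return (d, x, y) from the proposition."""
    d = 3 * t * t + sign
    x = t * t + d
    y = t * (t * t - 3 * d)
    return d, x, y
```

For example, `predicted_solution(1, -1)` gives $d = 2$ with the point $(3, -5)$, and `predicted_solution(2, 1)` gives $d = 13$ with the point $(17, -70)$, where $70^2 = 17^3 - 13$.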

For a discussion of the real quadratic case, see W. Adams and L. Goldstein 
[84], Chapter 10, and Mordell [189], Chapter 26. 


§11 The First Case of Fermat’s Conjecture for 
Regular Exponent 

In this last section we use results from Chapters 12 and 13 on the arithmetic of
cyclotomic number fields to prove a special case of Fermat's conjecture. If $\zeta$
denotes an $l$th root of unity different from 1, where $l$ is an odd prime, then $\mathbb{Q}(\zeta)$
is an algebraic number field of degree $l - 1$ whose ring of integers is, by
Proposition 13.2.10, $\mathbb{Z}[\zeta]$. Thus by Theorem 2, Chapter 12 every nonzero ideal in
$\mathbb{Z}[\zeta]$ can be factored uniquely as a product of powers of distinct prime ideals.
Recall that $l$ is called regular if $l \nmid h$ where $h$ denotes the class number of $\mathbb{Q}(\zeta)$.
Thus if $\mathfrak{A}$ is an ideal such that $\mathfrak{A}^l$ is principal then $\mathfrak{A}$ itself is principal, a fact of
central importance in the following.

We need one additional result concerning the arithmetic of $\mathbb{Z}[\zeta]$.

Lemma 1. If $u$ is a unit in $\mathbb{Z}[\zeta]$ then $\zeta^s u$ is real for some rational integer $s$.

Proof. Observe first of all that complex conjugation is an automorphism of
$\mathbb{Q}(\zeta)$ since $\bar{\zeta} = \zeta^{l-1}$. Thus if $u$ is a unit then $\bar{u}$ is a unit and $\tau = u/\bar{u} \in \mathbb{Z}[\zeta]$.
Furthermore if $\rho$ is any automorphism of $\mathbb{Q}(\zeta)$ then $\rho(\tau) = \rho(u)/\rho(\bar{u}) =
\rho(u)/\overline{\rho(u)}$ so that $|\rho(\tau)| = 1$. By Lemmas 1 and 2, Section 5, Chapter 14,
$\tau = \pm\zeta^t$ for some integer $t$. If $\lambda = 1 - \zeta$ then $\zeta^j \equiv 1\ (\lambda)$ for all $j$, so that
writing $u = a_0 + a_1\zeta + \cdots + a_{l-2}\zeta^{l-2}$ and using the fact that $\rho(\zeta) = \zeta^k$ for
some $k$ we see that $u \equiv \rho(u)\ (\lambda)$. In particular $u \equiv \bar{u}\ (\lambda)$. If $\tau = -\zeta^t$ then $u =
-\zeta^t\bar{u}$ so that $u \equiv -u\ (\lambda)$. Thus $2u \equiv 0\ (\lambda)$ which is impossible. Therefore
$u = \zeta^t\bar{u} = \zeta^{-2s}\bar{u}$ where $-2s \equiv t\ (l)$. Finally $\zeta^s u = \zeta^{-s}\bar{u} = \overline{\zeta^s u}$, showing that
$\zeta^s u$ is real. □

The main result of this section is the following. 

Proposition 17.11.1. If $l$ is a regular prime then the diophantine equation

$$x^l + y^l = z^l \tag{74}$$

has no solution in rational integers $x, y, z$ with $l \nmid xyz$.

The proof of this proposition will be presented in several lemmas. We
begin by factoring the left-hand side of (74)

$$x^l + y^l = (x + y)(x + \zeta y) \cdots (x + \zeta^{l-1}y). \tag{75}$$

Recall that two ideals $\mathfrak{A}$ and $\mathfrak{B}$ are relatively prime in $\mathbb{Z}[\zeta]$ if $\mathfrak{A} + \mathfrak{B} = \mathbb{Z}[\zeta]$.
When this is the case $\mathfrak{A}$ and $\mathfrak{B}$ have no common prime ideal divisors. Assume
for the remainder of this section that (74) has a solution in integers $x, y, z$,
$l \nmid xyz$ and that $l \nmid h$. Suppose, as we may, that $x, y, z$ are pairwise relatively
prime.

Lemma 2. The ideals $(x + \zeta^i y)$ and $(x + \zeta^j y)$ are relatively prime if $i \not\equiv j\ (l)$.
This lemma has already been proven in Section 6, Chapter 14.

Lemma 3. There exist $u, \beta \in \mathbb{Z}[\zeta]$, $u$ a real unit, such that $x + \zeta y = \zeta^s u\beta$,
where $s \in \mathbb{Z}$ and $\beta \equiv n\ (l)$ for some $n \in \mathbb{Z}$.

Proof. Using Lemma 2, the Corollary in Section 6, Chapter 14, and the fact
that the right-hand side of (74) is an $l$th power we see that $(x + \zeta y) = \mathfrak{A}^l$ for
some ideal $\mathfrak{A}$. Since $l \nmid h$ it follows that $\mathfrak{A}$ is principal. Thus $x + \zeta y = \varepsilon\alpha^l$
where $\alpha \in \mathbb{Z}[\zeta]$ and $\varepsilon$ is a unit. The result follows from Lemma 1 and the
observation that if $\alpha = \sum_{i=0}^{l-2} a_i\zeta^i$ then $\alpha^l \equiv \sum_{i=0}^{l-2} a_i^l\ (l)$. □

Taking conjugates one has $x + \zeta^{-1}y = \zeta^{-s}u\bar{\beta}$ so that $\zeta^{-s}(x + \zeta y) -
\zeta^{s}(x + \zeta^{-1}y) = u(\beta - \bar{\beta})$. However $\beta \equiv \bar{\beta} \equiv n\ (l)$ and so we have shown that
$\zeta^{-s}(x + \zeta y) - \zeta^{s}(x + \zeta^{-1}y) \in l\mathbb{Z}[\zeta]$. We state this as

Lemma 4. $x + \zeta y - \zeta^{2s}x - \zeta^{2s-1}y \in l\mathbb{Z}[\zeta]$.

By Proposition 6.4.1, $1, \zeta, \zeta^2, \ldots, \zeta^{l-2}$ are linearly independent over $\mathbb{Q}$.
Furthermore we may assume $l > 3$ (by Section 8) and $0 \le s \le l - 1$. The
proof of Proposition 17.11.1 will be completed by deriving a contradiction
from the relation of Lemma 4. By the above comment we need only to examine
the cases when two of the powers of $\zeta$ are the same. Thus we must examine the
cases

(a) $\zeta^{2s} = 1$,
(b) $\zeta^{2s-1} = 1$,
(c) $\zeta^{2s-1} = \zeta$.

In case (a), Lemma 4 implies $-y + \zeta^2 y \in l\mathbb{Z}[\zeta]$ so that $l \mid y$. In case (c), we
find $x - \zeta^2 x \in l\mathbb{Z}[\zeta]$ so that $l \mid x$, a contradiction. Finally in case (b) we find
$(x - y) + (y - x)\zeta \in l\mathbb{Z}[\zeta]$. Thus $x \equiv y\ (l)$. Write Fermat's equation as
$x^l + (-z)^l = (-y)^l$. Then, arguing as earlier, we obtain Lemma 4 with a
possibly different $s$. However cases (a) and (c) lead to contradictions and case
(b) gives, as above, $x \equiv -z\ (l)$. But $0 = x^l + y^l - z^l \equiv x + y - z\ (l)$. Thus
$3x \equiv 0\ (l)$ which implies $l \mid x$, a contradiction! This completes the proof of the
first case of Fermat's Last Theorem for regular exponent. □

The above proof is essentially that given in Borevich and Shafarevich [9].


§12 Diophantine Equations and Diophantine
Approximation

In this final section we give a brief discussion of the relationship between
diophantine equations and the approximation of algebraic numbers by
rational numbers. The techniques required to prove the results mentioned
below are different from those developed in the preceding chapters. Here we
can only give an indication of the results and refer the interested reader to the
literature.

If $\alpha$ is an irrational number then by Proposition 17.5.1 there are infinitely
many rational numbers $p/q$ such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}.$$

It is natural to ask whether the exponent 2 in this inequality can be increased.
A deep result of Roth in 1955 [118], for which he was awarded the Fields
Medal in 1958, asserts that if $\alpha$ is algebraic of degree $\ge 2$ then for each fixed
$\varepsilon > 0$ there are at most finitely many rational numbers $p/q$, $q > 0$ with

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{2+\varepsilon}}.$$

It follows that there is a constant $c > 0$ such that for all rationals $p/q$ one has

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^{2+\varepsilon}}.$$

The theorem of Roth was preceded by deep results of A. Thue (1909) and
C. L. Siegel (1921) each of which improved an elementary estimate of
J. Liouville (1844). This simple result is the following.


Proposition 17.12.1. If $\alpha$ is a real algebraic number of degree $n$, $n \ge 2$, then there
is a constant $c > 0$ such that for any rational number $p/q$, $q > 0$,

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^n}.$$

Proof. It is clearly enough to assume $|\alpha - p/q| < 1$. By the mean value
theorem $|f(p/q)| = |f(\alpha) - f(p/q)| \le |\alpha - p/q|\,A$ where $f(x) \in \mathbb{Z}[x]$ is
irreducible, $f(\alpha) = 0$, and $A = \sup |f'(x)|$, $|x - \alpha| < 1$. But since $\alpha$ is not
rational $f(p/q) \ne 0$ and $|f(p/q)| \ge 1/q^n$. This completes the proof. □
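
The constant in the proof can be made explicit in a simple case. The sketch below is a numerical illustration for the example $\alpha = \sqrt{2}$, $f(x) = x^2 - 2$ (our choice), where $A = \sup |2x|$ on $|x - \sqrt{2}| < 1$ equals $2(\sqrt{2} + 1)$. It uses floating point arithmetic, so it is a plausibility check rather than a proof.

```python
from math import sqrt


def liouville_check(q_max):
    """Return (min over q of q^2 * |sqrt(2) - p/q|, the bound 1/A)."""
    alpha = sqrt(2.0)
    A = 2 * (alpha + 1)               # sup |f'(x)| for f(x) = x^2 - 2
    worst = min(
        q * q * abs(alpha - round(q * alpha) / q)   # best p for this q
        for q in range(1, q_max + 1)
    )
    return worst, 1 / A
```

For denominators up to 1000 the smallest value of $q^2|\sqrt{2} - p/q|$ is about 0.343, attained at $p/q = 3/2$, comfortably above the bound $1/A \approx 0.207$.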


The Thue and Siegel results replaced $n$ by $n/2 + 1$ and $2\sqrt{n}$ respectively.
Roth's result is, in a certain sense, the best possible, by Dirichlet's theorem
(Proposition 17.5.1). However we shall see that any improvement in the
Liouville estimate, i.e., any lowering of the exponent $n$ (but greater than 2!)
has profound consequences in the study of certain diophantine equations. In
fact, let $a_nx^n + a_{n-1}x^{n-1} + \cdots + a_0$ be a polynomial with integral
coefficients, irreducible over $\mathbb{Q}$ and of degree at least 3. For a nonzero integer $m$
consider the diophantine equation

$$a_nx^n + a_{n-1}x^{n-1}y + \cdots + a_0y^n = m. \tag{77}$$


We will show that if one has an inequality of the form

$$\left|\alpha - \frac{p}{q}\right| > \frac{c}{q^{n-\varepsilon}}, \qquad n - \varepsilon > 2, \tag{78}$$

valid for some $0 < \varepsilon < n$, and all rational numbers $p/q$ then (77) has at most a
finite number of integral solutions. This remarkable result follows quite
easily from (78). For write (77) in the form

$$\left(\frac{x}{y} - \alpha^{(1)}\right)\left(\frac{x}{y} - \alpha^{(2)}\right) \cdots \left(\frac{x}{y} - \alpha^{(n)}\right) = \frac{m}{a_ny^n},$$

where $\alpha^{(1)}, \ldots, \alpha^{(n)}$ are the roots of $a_nx^n + \cdots + a_0$.
Put $\Delta = \min |\alpha^{(i)} - \alpha^{(j)}|$, $i \ne j$. Then if $(x, y)$ is an integral solution, $y \ne 0$,
clearly at most one $\alpha^{(j)}$ satisfies $|x/y - \alpha^{(j)}| < \Delta/2$. For such an $\alpha^{(j)}$ apply (78)
and for the remaining terms use $|x/y - \alpha^{(i)}| \ge \Delta/2$. Then

$$\frac{|m|}{|y|^n} > \frac{T}{|y|^{n-\varepsilon}}$$

for a suitable $T$ depending only on $\alpha^{(1)}, \ldots, \alpha^{(n)}$. Thus

$$|m| > T|y|^{\varepsilon}, \qquad \varepsilon > 0,$$

from which it follows that $|y|$ is bounded. But for any $y$ the number of $x$
satisfying (77) is bounded and we are through. Thus while $x^2 - 2y^2 = 1$ has
infinitely many integral solutions, $x^3 - 2y^3 = 1$ has only finitely many
integral solutions.
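
The contrast between the quadratic and the cubic equation is already visible in a small search. The sketch below is illustrative only: it lists solutions with $|x|, |y|$ up to a bound of our choosing, and a finite search of course proves nothing about solutions beyond that bound.

```python
def small_solutions(form, value, bound):
    """All integer (x, y) with |x|, |y| <= bound and form(x, y) == value."""
    return [
        (x, y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if form(x, y) == value
    ]


pell = small_solutions(lambda x, y: x * x - 2 * y * y, 1, 200)
thue = small_solutions(lambda x, y: x**3 - 2 * y**3, 1, 200)
```

With the bound 200 the Pell equation already has 14 solutions (built from $(1, 0)$, $(3, 2)$, $(17, 12)$, $(99, 70)$ and sign changes), while the cubic equation has only $(1, 0)$ and $(-1, -1)$.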

Among the texts treating in detail this vast area of number theory we 
recommend K. B. Stolarsky [225], A. Baker [89], and W. M. Schmidt [217].

Notes 

The literature on diophantine equations is vast. We will cite only a few articles 
and essays that have a relationship with the equations discussed in this 
chapter. For a good general survey article we recommend W. J. LeVeque, 
“A Brief Survey of Diophantine Equations” [180], as well as the early essay 
by G. H. Hardy [39]. The supplement of Heath’s edition of Diophantus [146], 
provides a technical study of the equations considered by Fermat and Euler in 
the seventeenth and eighteenth centuries. See also the scholarly work by J. E. 
Hoffman [152], where a detailed analysis is made of the results of Fermat and 
Euler and their relationship to the tangent method for finding rational points 
on cubic curves described in Sections 9 and 10. Relationships between this 
process and the corresponding diophantine equations modulo p will be 
indicated in the following chapter. 

Excellent chapters on diophantine problems can be found in various 
introductory texts on number theory. We mention in particular Adams and 
Goldstein [84], Hardy and Wright [40], Uspensky and Heaslet [230],
Davenport [22], and Niven and Zuckerman [61].

For a broad perspective on the formative period of this branch of mathematics
and number theory in general, see the informal lecture by A. Weil
[235].

An extensive coverage of diophantine equations by a modern master is 
given in the text by L. J. Mordell, [189]. A much more sophisticated and 
abstract approach is taken by S. Lang in his book Diophantine Geometry 
[170]. For a spirited discussion of the relative merits of these books the 
interested reader should consult the reviews of Lang’s book by Mordell, 
[190] and the subsequent review of Mordell’s book by Lang [172]. See also 
the advanced surveys by S. Lang [53], [173]. 

Exercises 

1. Show that $165x^2 - 21y^2 = 19$ has no integral solution.

2. Find the integral solutions to $y^2 + 31 = x^3$.

3. Show that $x^3 + y^3 = 3z^3$ has no solution $x, y, z \in \mathbb{Z}[\omega]$, $z \ne 0$.

4. (In memoriam Ramanujan) Show that 1729 is the smallest positive integer expressible
as the sum of two different integral cubes in two ways.

5. Which of the following have nontrivial solutions?

(a) $3x^2 - 5y^2 + 7z^2 = 0$.

(b) $7x^2 + 11y^2 - 19z^2 = 0$.

(c) $8x^2 - 5y^2 - 3z^2 = 0$.

(d) $11x^2 - 3y^2 - 41z^2 = 0$.

6. Find the fundamental solutions to $x^2 - 3y^2 = 1$, $x^2 - 6y^2 = 1$, $x^2 - 624y^2 = 1$.

7. Reduce the problem of the integral solutions of $3x^2 + 1 = 4y^3$ to Proposition 17.8.1
as follows:

(a) Put $t = (3x - 1)/2$; $t \ne 1, -2$, so that $t^2 + t + 1 = 3y^3$, $y \ne 0$.

(b) $(t + 2)^3 + (1 - t)^3 = (3y)^3$.

8. Find the integral solutions to $y^2 = x^3 - 4$.

9. Find four rational points on $x^3 + y^3 = 9$ using the method of Proposition 17.9.1.

10. Find the integral solutions to y 2 = x 3 — 1. 

11. Show that if x 2 — dy 2 = — 1 has an integral solution then so does x 2 — dy 2 = 1. 

12. List the integral solutions of $x^2 + y^2 + z^2 + w^2 = 15$ and check with Proposition
17.7.2.

13. Let t be an integral cube. Show that y 2 — x 2 — t has an integral solution. 

14. Show that if $x^2 - dy^2 = n$, $d > 0$ square-free, has an integral solution with $xy \ne 0$ it has
infinitely many.

15. Let $a + b\sqrt{p}$ be the fundamental solution to $x^2 - py^2 = 1$, where $p$ is prime, $p \equiv 1$
(4). The following steps show that $x^2 - py^2 = -1$ has an integral solution $x, y$,
$xy \ne 0$.

(a) $a$ is odd.

(b) $a \pm 1 = 2u^2$, $a \mp 1 = 2pv^2$, $2uv = b$.

(c) $u^2 - pv^2 = \pm 1$.

(d) In (c) the negative sign holds.

The following seven exercises establish the corollary to Proposition 17.7.2. Let $A(n)$
denote the number of integral solutions to $x_1^2 + x_2^2 + x_3^2 + x_4^2 = n$. See [52].

16. Show that $A(4n) = A(2n)$.

17. If $n$ is odd show that $16\sum_{d \mid n} d + A(n) = A(4n)$.

18. If $n$ is odd let $S$ be the set of solutions to $x_1^2 + x_2^2 + x_3^2 + x_4^2 = 2n$ with $x_1 \equiv
x_2 \equiv 1\ (2)$ and $x_3 \equiv x_4 \equiv 0\ (2)$. Show that the number of elements of $S$ is $\frac{1}{6}A(2n)$.

19. If $n \equiv 1\ (4)$ and $S$ is as in Exercise 18 show that the number of elements in $S$ is $\frac{1}{2}A(n)$.
Conclude that $A(2n) = 3A(n)$.

20. If $n \equiv 3\ (4)$ then $A(2n) = 3A(n)$.

21. If $n$ is odd show that $A(n) = 8\sum_{d \mid n} d$ and $A(2n) = 24\sum_{d \mid n} d$.

22. If $n$ is even, $n = 2^s m$, $s \ge 1$, $m$ odd, show that $A(n) = 24\sum_{d \mid m} d$.

23. The discriminant of $t^3 + pt + q$ is $-(4p^3 + 27q^2)$. Reduce the problem of determining
the cubics with discriminant 1 and $p, q$ rational to Fermat's equation $x^3 + y^3 = 1$
by putting $x = (3q + 1)/(3q - 1)$, $y = 2p/(3q - 1)$, $q \ne \pm\frac{1}{3}$. Show that the resulting
cubics are $t^3 - t \pm \frac{1}{3}$.

24. Show that Proposition 17.3.1 implies Proposition 17.3.2. 

25. Show that if $b$ is a positive integer and $-1$ is a square modulo $b$ then $x^2 + y^2 = b$ has
an integral solution.

26. If (n, m) = 1 show that a Rm, a Rn implies a R mn. 

27. Justify the rearrangement steps in Proposition 17.6.1. 

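28. Let $A$ be the set of complex matrices of the form

$$\begin{pmatrix} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{pmatrix}.$$

Show that Euler's identity, which states that $(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 +
y_3^2 + y_4^2)$ equals the right-hand side of Equation (42), is equivalent to $\det(MN) =
(\det M)(\det N)$ for $M, N \in A$.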

29. The following argument shows that Proposition 17.8.1 implies that $y^2 = x^3 - 432$
has $(12, \pm 36)$ as its only rational solutions. Fill in the details. Assume a solution
$(x, y)$ exists distinct from $(12, \pm 36)$, $x > 0$.

(a) Write $y/36 = a/c$, $x/12 = b/c$, with $a \equiv c \equiv 0\ (2)$.

(b) Put $r = (a + c)/2$, $s = (c - a)/2$, $t = b > 0$.

(c) Show that $r^3 + s^3 = t^3$, $rst \ne 0$.

30. The converse to Exercise 29 is also true. Show that if $x^3 + y^3 = z^3$, $xyz \ne 0$,
$x, y, z \in \mathbb{Z}$ then putting $r = 36(x - y)/(x + y)$, $s = 12z/(x + y)$ leads to $r^2 = s^3
- 432$.

31. Using the fact that $x^4 + y^4 = z^2$ has no integral solution $xyz \ne 0$ show that $(0, \pm 1)$
are the only rational solutions to $y^2 = x^4 + 1$.

32. Let $d$ be a square-free integer, $d \equiv 1$ or 2 modulo 4. Show that if $x$ and $y$ are integers
such that $y^2 = x^3 - d$ then $(x, 2d) = 1$.



Chapter 18 

Elliptic Curves 


Many of the themes studied throughout this book come
together in the arithmetic theory of elliptic curves. This is a
branch of number theory whose roots go back a long way,
but which is, nevertheless, the subject of intense investigation
at the present time.

In this chapter we will give a brief overview of some of
the relevant definitions, problems, and conjectures about
elliptic curves. In particular, it is our purpose to describe a
subtle and influential conjecture due to B. J. Birch and
H. P. F. Swinnerton-Dyer. For the most part we will
omit proofs and be content to give a rough guide to the
ideas involved. For curves of the form $y^2 = x^3 + D$ and
$y^2 = x^3 - Dx$ we will give a more detailed analysis and
show how the global zeta functions of these curves are
related to Hecke L-functions. This will yield a special case
of an important theorem due to M. Deuring. Our exposition
is based on the seminal papers of H. Davenport and
H. Hasse [23] and A. Weil [81].

The techniques that are currently being used to study 
elliptic curves are among the most sophisticated in all of 
mathematics. We hope that the elementary approach of 
this chapter will inspire the reader to further study in this 
fascinating and lively branch of number theory. There is 
much to be learned and much work yet to be done. 


§1 Generalities 

We begin with some general observations about curves in projective space. 
For the terminology the reader may wish to review Chapter 10, Section 1. 

Let $K$ be a field and $F(x_0, x_1, x_2) \in K[x_0, x_1, x_2]$ a homogeneous polynomial
of degree $d$. A very general problem is to determine whether
$F(x_0, x_1, x_2) = 0$ has a solution in $P^2(K)$.

It is useful to introduce geometric terminology. The equation

$$F(x_0, x_1, x_2) = 0$$

is said to define a curve of degree $d$ over $K$. The field $K$ is called a field of
definition. If $L$ is a field containing $K$ one can consider the zeros of $F$ in
$P^2(L)$. In our previous terminology this is the hypersurface $H_F(L)$. A hypersurface
in projective 2-space is appropriately called a curve. Notice $F$ sets
up a map from fields containing $K$ to sets; $L \mapsto H_F(L)$.

A point $a \in H_F(L)$ is said to be a nonsingular point if it is not a simultaneous
solution to the equations

$$\frac{\partial F}{\partial x_0} = 0, \qquad \frac{\partial F}{\partial x_1} = 0, \qquad \frac{\partial F}{\partial x_2} = 0.$$

In this case, the line

$$\frac{\partial F}{\partial x_0}(a)\,x_0 + \frac{\partial F}{\partial x_1}(a)\,x_1 + \frac{\partial F}{\partial x_2}(a)\,x_2 = 0$$

is called the tangent line to $F$ at $a$. The curve $F(x_0, x_1, x_2) = 0$ is said to be
nonsingular if all the points in $H_F(L)$ are nonsingular for all extensions $L$ of
$K$. It can be shown that it is enough to check this for algebraic extensions
of $K$. (In Chapter 11 we called this notion absolutely nonsingular.)

If two curves intersect at a point, one can define an integer called the
intersection multiplicity of the two curves at the point. This is a somewhat
delicate notion and we will not go into detail about it (see W. Fulton [135],
Chapter 3). In general, if $L$ is algebraically closed, a line in $P^2(L)$ intersects
a curve of degree $d$ in $d$ points if multiplicity is taken into account. To get an
idea of why this is true, write $x = x_1/x_0$, $y = x_2/x_0$, and $f(x, y) = F(1, x, y)$.
We work for the moment in affine 2-space $A^2(L)$. To find the intersection
points of $f(x, y) = 0$ with the line $y = mx + b$ one simply substitutes for $y$
and finds the roots of $f(x, mx + b) = 0$. If $F$ has degree $d$ this latter equation
will, in general, have degree $d$, and since $L$ is algebraically closed there will
be $d$ roots if multiplicity is taken into account. The only exceptions will be
intersections at infinity, in which case $f(x, mx + b)$ will have degree less
than $d$.

As an example, consider F(x 0 , x u x 2 ) = —xl — xf + x 0 x\. Then 
/(x, y) — — 1 — x 3 + y 2 so the affine part of the curve is given by y 2 = 
x 3 + 1. The intersection with the line y = x + 1 is determined by (x + l) 2 = 
x 3 + 1 leading to the three points ( — 1,0), (0,1), and (2,3). On the other hand 
the line = 1 leads to the equation x 3 = 0. This is interpreted as saying 
that y = 1 intersects y 2 = x 3 + 1 at the point (0, 1) with multiplicity 3. 

The intersections with vertical lines $x = c$ are determined by $f(c, y) = 0$.
In our example, $y^2 = c^3 + 1$, so there are two finite points of intersection,
$(c, \sqrt{c^3 + 1})$ and $(c, -\sqrt{c^3 + 1})$, provided $c^3 + 1 \neq 0$. The third point of
intersection is at infinity. If $c^3 + 1 = 0$, then $(c, 0)$ is an intersection point
of multiplicity 2.

Finally, the intersections with the line at infinity $x_0 = 0$ can be obtained
from the equation $F(0, x_1, x_2) = -x_1^3$, so the point $(0, 0, 1) \in P^2(L)$ is an
intersection point of multiplicity 3.

If $a \in H_F(L)$ then the tangent line to $F$ at $a$ can be shown to intersect
the curve at $a$ with multiplicity two or greater. If the multiplicity is greater
than 2 then $a$ is said to be a flex point.

If $F$ is defined over $K$ then a zero of $F$ in $P^2(K)$ is said to be a rational
point over $K$.

We will say that a nonsingular homogeneous cubic polynomial

$$F(x_0, x_1, x_2) \in K[x_0, x_1, x_2]$$

defines an elliptic curve over $K$ provided there is at least one rational point.
The problem of determining all rational points on an elliptic curve has given 
rise to a vast body of theory. 

One of the things which make elliptic curves so interesting is the fact 
that the set of rational points can be made into an abelian group in a natural 
way. 

Let $F(x_0, x_1, x_2) = 0$ define an elliptic curve over $K$. If $L$ is a field extension
of $K$ we will write $E(L)$ instead of $H_F(L)$.

Let $O$ be an element of $E(K)$. If $P_1, P_2 \in E(L)$ then the line connecting
$P_1$ and $P_2$ intersects the curve in a uniquely determined third point $P_3$
which is easily seen to be in $E(L)$. If $P_1 = P_2$ then the tangent line at $P_1$
gives rise to a third point $P_3$. It is tempting to take $P_3$ as the "sum" of $P_1$
and $P_2$. However, this would not define a group structure since there would
be no identity. What we do instead is to find the third point of intersection
with $E$ of the line connecting $O$ with $P_3$ and call this new point $P_1 + P_2$.
With this definition $E(L)$ becomes an abelian group having $O$ as the identity
element. The proof is not hard except for showing associativity, i.e.,

$$P_1 + (P_2 + P_3) = (P_1 + P_2) + P_3.$$

For a rigorous treatment of this construction see [135], Chapter 5, especially
pp. 124 and 125.

If the characteristic of $K$ is not 2 or 3 it can be shown that every elliptic
curve over $K$ can be transformed into one of the form

$$x_0 x_2^2 = x_1^3 - A x_0^2 x_1 - B x_0^3, \quad A, B \in K.$$

This curve has exactly one point at infinity, namely $(0, 0, 1) \in P^2(K)$. We
call this point $\infty$ and take it as the zero element of our group.

The line at infinity $x_0 = 0$ intersects the curve at the point $\infty$ with
multiplicity 3. If $x_0 \neq 0$ set $x = x_1/x_0$ and $y = x_2/x_0$. Then, in affine coordinates
the defining equation of the curve is

$$y^2 = x^3 - Ax - B.$$

The point at infinity is thought of as lying infinitely far off in the direction
of the $y$ axis.

A calculation shows that the nonsingularity of

$$F(x_0, x_1, x_2) = x_0 x_2^2 - x_1^3 + A x_0^2 x_1 + B x_0^3$$

is equivalent to the nonvanishing of

$$\Delta = 16(4A^3 - 27B^2).$$

This number is 16 times the discriminant of the polynomial
$x^3 - Ax - B$.

Conversely, if $\Delta \neq 0$ then $F$ defines an elliptic curve.
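
As a quick illustration of the nonsingularity criterion, here is a small sketch that evaluates $\Delta = 16(4A^3 - 27B^2)$ for a curve $y^2 = x^3 - Ax - B$. The helper names are hypothetical and chosen only for this example.

```python
def discriminant(A, B):
    """Delta = 16(4A^3 - 27B^2) for the curve y^2 = x^3 - A*x - B."""
    return 16 * (4 * A**3 - 27 * B**2)


def is_nonsingular(A, B):
    """True when the cubic x^3 - A*x - B has no repeated root."""
    return discriminant(A, B) != 0


# y^2 = x^3 + 1 corresponds to A = 0, B = -1.
print(discriminant(0, -1), is_nonsingular(0, -1))
# x^3 - 3x + 2 = (x - 1)^2 (x + 2) has a double root, so A = 3, B = -2 is singular.
print(discriminant(3, -2), is_nonsingular(3, -2))
```

The first curve gives $\Delta = -432 = -2^4 3^3$, while the second gives $\Delta = 0$ because the cubic has a repeated root.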

The fact that $\infty$ is a flex point can be used to show that $P_1 + P_2 + P_3 = \infty$
iff $P_1$, $P_2$, and $P_3$ lie on a straight line. In particular, $-P$ is the third point
of intersection of the line connecting $P$ and $\infty$. In affine coordinates this
shows $-(a, b) = (a, -b)$ since the line connecting $(a, b)$ and $\infty$ is the vertical
line $x = a$. The points of order 2 are those for which $b = 0$. If
$x^3 - Ax - B = (x - a_1)(x - a_2)(x - a_3) \in L[x]$ then the points of order dividing 2 on
$E(L)$ are $\infty$, $(a_1, 0)$, $(a_2, 0)$, $(a_3, 0)$.

As an example of how to add points consider $P_1 = (2, 3)$ and $P_2 = (-1, 0)$
on $y^2 = x^3 + 1$. The line connecting $P_1$ and $P_2$ is given by $y = x + 1$. The
equation $(x + 1)^2 = x^3 + 1$ has three roots $2$, $-1$, and $0$ corresponding
to $P_1$, $P_2$, and $(0, 1)$. Thus $P_1 + P_2 = (0, -1)$.
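
The chord and tangent construction can be turned into explicit formulas for curves in the form $y^2 = x^3 - Ax - B$. The sketch below is one standard way to write them, with $\infty$ as the identity; the representation of $\infty$ by `None` and the function name `add` are assumptions made for this example, and exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction

INFINITY = None  # stands for the point at infinity, the identity element


def add(P, Q, A):
    """Add two points on y^2 = x^3 - A*x - B (B does not enter the formulas)."""
    if P is INFINITY:
        return Q
    if Q is INFINITY:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return INFINITY  # the line through P and Q is vertical
    if x1 == x2:
        slope = Fraction(3 * x1 * x1 - A, 2 * y1)  # tangent line at P = Q
    else:
        slope = Fraction(y2 - y1, x2 - x1)  # chord through P and Q
    # The three roots of the substituted cubic sum to slope^2.
    x3 = slope * slope - x1 - x2
    # Third intersection point is (x3, -y3); reflecting in the x axis gives the sum.
    y3 = slope * (x1 - x3) - y1
    return (x3, y3)


# The example from the text: y^2 = x^3 + 1, i.e. A = 0.
print(add((2, 3), (-1, 0), 0))   # expected to be (0, -1)
print(add((2, 3), (2, 3), 0))    # doubling (2, 3) via the tangent line
```

With $P_1 = (2, 3)$ and $P_2 = (-1, 0)$ the chord has slope 1 and the function returns $(0, -1)$, matching the computation above.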

Now suppose $K = \mathbb{Q}$, the rational numbers. In 1922 L. J. Mordell
proved the following remarkable theorem, conjectured by H. Poincaré
in 1901 [203].

Theorem 1. Let $E$ be an elliptic curve defined over $\mathbb{Q}$. Then $E(\mathbb{Q})$ is a finitely
generated abelian group.


In 1928 A. Weil extended this result to the case where Q is replaced by an 
arbitrary algebraic number field. The resulting theorem is referred to as the 
Mordell-Weil theorem. 

The subgroup $E(\mathbb{Q})_t \subseteq E(\mathbb{Q})$, consisting of points of finite order, is finite.
It turns out that there is an effective method for computing $E(\mathbb{Q})_t$ in any
given case.

It was conjectured for some time that there is a uniform upper bound for
$|E(\mathbb{Q})_t|$ as $E$ varies over all elliptic curves defined over $\mathbb{Q}$. It was noticed
by G. Shimura and others that the theory of elliptic modular curves could
be used to attack this problem. This point of view was extensively developed
by A. Ogg who proved a number of partial results and made some rather
precise conjectures. Finally, in 1976 B. Mazur proved the following very
deep result which had been conjectured by Ogg.


Theorem 2. Let $E$ be an elliptic curve defined over $\mathbb{Q}$. Then $E(\mathbb{Q})_t$ is isomorphic
to one of the following groups: $\mathbb{Z}/m\mathbb{Z}$ for $m \le 10$ or $m = 12$, or
$\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2m\mathbb{Z}$ for $m \le 4$.

It is also believed that there is a uniform upper bound for $|E(K)_t|$ where $E$
varies over elliptic curves defined over a fixed algebraic number field $K$.
This is not known to be true for a single such $K \neq \mathbb{Q}$, but partial results
have been obtained by V. A. Demjanenko, D. Kubert, and Y. Manin, among
others.

Another important integer associated to $E(\mathbb{Q})$ has proved to be even
more intractable, namely the rank. The rank of an abelian group is the
maximal number of independent elements. If $A$ is an abelian group we say
a set of elements $a_1, a_2, \ldots, a_t \in A$ is independent if
$m_1 a_1 + m_2 a_2 + \cdots + m_t a_t = 0$ with $m_1, m_2, \ldots, m_t \in \mathbb{Z}$ implies
$m_1 = m_2 = \cdots = m_t = 0$.

We denote the rank of $E(\mathbb{Q})$ by $r_E$.

The rank $r_E$ has been computed for a large number of elliptic curves
over $\mathbb{Q}$. In most examples it is quite small: 0, 1, or 2. A. Néron has shown the
existence of an elliptic curve over $\mathbb{Q}$ with rank 11. His method is not
constructive. In 1977 A. Brumer and K. Kramer produced an explicit example
with $r_E \ge 9$. Here it is:

$$y^2 + 525xy = x^3 + 228x^2 - 14972955x + (856475)^2.$$

It is not known if there is an upper bound on the numbers $r_E$, where $E$
is defined over $\mathbb{Q}$. Cassels considers this to be unlikely ([109], Section 20).

One of the most celebrated conjectures in modern number theory
connects the number $r_E$ with the order at $s = 1$ of an analytic function associated
with $E$. This conjecture was formulated by the English mathematicians
B. J. Birch and H. P. F. Swinnerton-Dyer. The formulation of their conjecture
will be the task of the next section.


§2 Local and Global Zeta Functions of an Elliptic Curve


Let $E$ be the elliptic curve defined over $\mathbb{Q}$ by the equation

$$x_0 x_2^2 = x_1^3 - A x_0^2 x_1 - B x_0^3, \quad A, B \in \mathbb{Q}. \tag{i}$$

The affine equation is obtained by setting $x = x_1/x_0$ and $y = x_2/x_0$:

$$y^2 = x^3 - Ax - B. \tag{ii}$$

The transformation $(x, y) \to (c^2 x, c^3 y)$ transforms this equation into

$$y^2 = x^3 - c^4 A x - c^6 B. \tag{iii}$$

Thus, we may assume to begin with that $A, B \in \mathbb{Z}$ and we make this
assumption from now on. The number $\Delta = 16(4A^3 - 27B^2)$ is called the
discriminant of $E$. As we have seen $\Delta \neq 0$.

Let $p \in \mathbb{Z}$ be a prime and consider the congruence

$$y^2 \equiv x^3 - Ax - B \ (p),$$

or equivalently the equation

$$y^2 = x^3 - \bar{A}x - \bar{B}, \quad \bar{A}, \bar{B} \in \mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p. \tag{iv}$$

This equation defines an elliptic curve $E_p$ over $\mathbb{F}_p$ provided that $p \nmid \Delta$.
In what follows only such primes will be considered unless explicitly stated
otherwise. The curve $E_p$ is called the reduction of $E$ modulo $p$.

Let $N_{p^m}$ be the number of points in $E_p(\mathbb{F}_{p^m})$. Then, as in Chapter 11,
we may consider the zeta function

$$Z(E_p, u) = \exp\left( \sum_{m=1}^{\infty} N_{p^m} \frac{u^m}{m} \right). \tag{v}$$

By use of the Riemann-Roch theorem it can be shown that

$$Z(E_p, u) = \frac{1 - a_p u + p u^2}{(1 - u)(1 - pu)}, \quad a_p \in \mathbb{Z}. \tag{vi}$$



In special cases this can be proved using the methods of Chapter 11.
H. Hasse was able to prove that $a_p^2 \le 4p$. It follows that

$$1 - a_p u + p u^2 = (1 - \pi u)(1 - \bar{\pi} u), \tag{vii}$$

where $\bar{\pi}$ is the complex conjugate of $\pi$. Clearly, $\pi\bar{\pi} = p$, $a_p = \pi + \bar{\pi}$. Also,
$|\pi| = |\bar{\pi}| = \sqrt{p}$. This is the "Riemann Hypothesis" for elliptic curves
over $\mathbb{F}_p$.

By logarithmically differentiating (v), (vi), and (vii) and comparing
coefficients one finds

$$N_{p^m} = p^m + 1 - \pi^m - \bar{\pi}^m. \tag{viii}$$

In particular, $N_p = p + 1 - a_p$. Thus, if one calculates $N_p$ this
determines $a_p$. Since $\pi$ and $\bar{\pi}$ are the roots of $T^2 - a_p T + p = 0$, Equation (viii)
yields $N_{p^m}$ for all $m \ge 1$.
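
To see how Equation (viii) produces every $N_{p^m}$ from $N_p$ alone, one can use the fact that the power sums $s_m = \pi^m + \bar{\pi}^m$ satisfy $s_m = a_p s_{m-1} - p\, s_{m-2}$ with $s_0 = 2$ and $s_1 = a_p$. The sketch below counts $N_p$ by brute force and then applies this recurrence; the function names are illustrative, and $p$ is assumed to be an odd prime with $p \nmid \Delta$.

```python
def count_points(A, B, p):
    """N_p for y^2 = x^3 - A*x - B over F_p, including the point at infinity."""
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    affine = sum(square_roots.get((x**3 - A * x - B) % p, 0) for x in range(p))
    return affine + 1


def counts_over_extensions(A, B, p, max_m):
    """Return a_p and a dict m -> N_{p^m} for 1 <= m <= max_m."""
    a_p = p + 1 - count_points(A, B, p)
    s_prev, s_curr = 2, a_p  # s_0 and s_1
    counts = {}
    for m in range(1, max_m + 1):
        counts[m] = p**m + 1 - s_curr
        s_prev, s_curr = s_curr, a_p * s_curr - p * s_prev
    return a_p, counts


a_5, counts = counts_over_extensions(1, 0, 5, 3)  # the curve y^2 = x^3 - x over F_5
print(a_5, counts)
```

For $y^2 = x^3 - x$ over $\mathbb{F}_5$ a direct count gives $N_5 = 8$, so $a_5 = -2$ and the recurrence gives $N_{25} = 25 + 1 - (a_5^2 - 2 \cdot 5) = 32$.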

A very special case which will be useful later is the following. If
$N_p = p + 1$ then

$$Z(E_p, u) = \frac{1 + p u^2}{(1 - u)(1 - pu)}.$$

It is useful to change the variable from $u$ to $p^{-s}$. We define

$$\zeta(E_p, s) = \frac{1 - a_p p^{-s} + p^{1-2s}}{(1 - p^{-s})(1 - p^{1-s})}. \tag{ix}$$

The function $\zeta(E_p, s)$ is called the local zeta function of $E$ at $p$.

It is illuminating to see that $\zeta(E_p, s)$ can be obtained from another point
of view which makes the connection with the Riemann zeta function much
clearer.

The ring $\mathbb{F}_p[x]$ and its quotient field $\mathbb{F}_p(x)$ are analogous to $\mathbb{Z}$ and its
quotient field $\mathbb{Q}$. Let $K = \mathbb{F}_p(x)(\sqrt{x^3 - Ax - B})$ and let $D$ be the integral
closure of $\mathbb{F}_p[x]$ in $K$, i.e., $D$ consists of all the elements in $K$ which satisfy
monic polynomials with coefficients in $\mathbb{F}_p[x]$. $D$ is a Dedekind domain and
every nonzero ideal is of finite index in $D$. If $I \subseteq D$ is a nonzero ideal let
$NI = |D/I|$, and define $\zeta_D(s) = \sum_I NI^{-s}$, where the sum is over all nonzero
ideals in $D$. It is not hard to show that $\zeta_D(s)$ converges for Re $s > 1$.
Moreover, one can prove that $\zeta_D(s) = (1 - p^{-s})\zeta(E_p, s)$. See also Section 1 of
Chapter 11.

The point of view outlined here is that taken by E. Artin in his thesis [2].

We have defined $\zeta(E_p, s)$ for those primes $p$ such that $p \nmid \Delta$. If $p \mid \Delta$ we
define

$$\zeta(E_p, s) = (1 - p^{-s})^{-1}(1 - p^{1-s})^{-1}.$$

This is not the best definition but it will suffice for our purposes.

Now that we have defined a local zeta function for all primes $p$, we define
a global zeta function by simply taking the product of the local zeta functions:

$$\zeta(E, s) = \prod_p \zeta(E_p, s). \tag{x}$$

From the definitions we see that $\zeta(E, s) = \zeta(s)\zeta(s - 1)L(E, s)^{-1}$ where

$$L(E, s) = \prod_{p \nmid \Delta} (1 - a_p p^{-s} + p^{1-2s})^{-1}. \tag{xi}$$

The function $L(E, s)$ is called the L-function of $E$. Recalling Hasse's
result that $1 - a_p p^{-s} + p^{1-2s} = (1 - \pi p^{-s})(1 - \bar{\pi} p^{-s})$ with
$|\pi| = |\bar{\pi}| = \sqrt{p}$ one can show fairly easily that the product for $L(E, s)$ converges for
Re $s > 3/2$.
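
For readers who want to experiment, the Euler product (xi) can be truncated and evaluated numerically at a real point $s > 3/2$, where it converges. The following is only a rough sketch: it computes each $a_p$ by brute-force point counting, and the function names and the cutoff `bound` are assumptions made for the illustration. It says nothing about the analytic continuation discussed next.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes; n is assumed to be at least 2."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def a_p(A, B, p):
    """a_p = p + 1 - N_p for y^2 = x^3 - A*x - B, by direct counting (p odd)."""
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    affine = sum(square_roots.get((x**3 - A * x - B) % p, 0) for x in range(p))
    return p - affine  # p + 1 - (affine + 1)


def partial_L(A, B, s, bound):
    """Product of the local factors of L(E, s) over primes p <= bound, p not dividing Delta."""
    delta = 16 * (4 * A**3 - 27 * B**2)
    value = 1.0
    for p in primes_up_to(bound):
        if delta % p == 0:
            continue  # bad primes (always including 2) are omitted
        value /= 1 - a_p(A, B, p) * p ** (-s) + p ** (1 - 2 * s)
    return value


print(partial_L(1, 0, 2.0, 200))  # truncated product for y^2 = x^3 - x at s = 2
```

Each local factor is positive for real $s > 3/2$, since it equals $|1 - \pi p^{-s}|^2$, so the truncated product is a positive real number.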

It was conjectured by Hasse that $\zeta(E, s)$ can be analytically continued
to all of $\mathbb{C}$. This was first shown to be true in special cases by Weil [81].
After that M. Deuring proved the result for an important class of elliptic
curves which are said to possess "complex multiplication."

Lang [169], Chapter 10, has an exposition of Deuring's results. Y.
Taniyama, and later A. Weil, conjectured that every elliptic curve over $\mathbb{Q}$
can be parameterized by elliptic modular forms. See the article by
Swinnerton-Dyer in [226] for a precise statement of this conjecture. For such curves
Hasse's conjecture is true. Thus, the evidence for the truth of Hasse's
conjecture seems overwhelming.

Assuming $L(E, s)$ can be continued to all of $\mathbb{C}$ it makes sense to speak of
the analytic behavior of $L(E, s)$ about $s = 1$.

On the basis of extensive empirical work on curves of the form
$y^2 = x^3 - Dx$, Birch and Swinnerton-Dyer were led to the following remarkable
conjecture.

Conjecture. Suppose $E$ is an elliptic curve defined over $\mathbb{Q}$. Then the rank of $E$,
$r_E$, is equal to the order of the zero of $L(E, s)$ at $s = 1$.

This conjecture can be supplemented. Assuming the conjecture we can
define a nonzero constant $B_E = \lim_{s \to 1} (s - 1)^{-r_E} L(E, s)$. Birch and
Swinnerton-Dyer give an expression for $B_E$ which depends on subtle arithmetic
invariants of $E$. It would take us too far afield to discuss these here. See
Cassels [109] or J. Tate [227].

In an important paper [114] published in 1977, J. Coates and A. Wiles 
made significant progress on the above conjecture. Their main result was 
subsequently generalized by N. Arthaud [87]. We would need to enter 
into the theory of complex multiplication to even state this result in full 
generality so we will be content with a special case. 

Theorem 3. Let $E$ be an elliptic curve defined over $\mathbb{Q}$ and suppose that $E$ has
complex multiplication. If $L(E, 1) \neq 0$, then $E(\mathbb{Q})$ is finite.

Most of the work we have been discussing is of a very advanced nature
and is beyond the scope of this book. In the following sections we will
discuss elliptic curves of two types: $y^2 = x^3 + D$ and $y^2 = x^3 - Dx$.
For these curves we will analyze the local and global zeta functions and show
on the basis of a fundamental result of E. Hecke that the global zeta function
of these curves can be analytically continued to all of $\mathbb{C}$. This will give the
reader a sample, at least, of the extensive arithmetic theory of elliptic curves.


§3 $y^2 = x^3 + D$, the Local Case

Let $D$ be a nonzero integer. We will consider the elliptic curve $E$ defined by
$x_0 x_2^2 - x_1^3 - D x_0^3 = 0$, or in affine coordinates $y^2 = x^3 + D$. The
discriminant $\Delta$ of $E$ is $-2^4 3^3 D^2$ so we will only consider primes $p \neq 2$ or $3$
and $p \nmid D$.

The curve $y^2 = x^3 + \bar{D}$ over $\mathbb{F}_p$ has one point at infinity. Thus,
$N_p = 1 + N(y^2 = x^3 + \bar{D})$ where we use the notation introduced in Chapter 8.
By means of Jacobi sums we will derive an explicit formula for $N_p$. From
now on we will write $D$ instead of $\bar{D}$ so by "abuse of notation" $D$ will represent
the coset of $D$ modulo $p$.

If $p \equiv 2$ (3) then $x \to x^3$ is an automorphism of $\mathbb{F}_p^*$. It follows easily (see
Exercise 1) that $N_p = p + 1$ in this case.

If $p \equiv 1$ (3) let $\chi$ be a character of order 3 and $\rho$ a character of order 2
of $\mathbb{F}_p^*$. Then

$$N(y^2 = x^3 + D) = \sum_{u+v=D} N(y^2 = u)N(x^3 = -v)$$

$$= \sum_{u+v=D} (1 + \rho(u))(1 + \chi(-v) + \chi^2(-v))$$

$$= p + \sum_{u+v=D} \rho(u)\chi(v) + \sum_{u+v=D} \rho(u)\chi^2(v).$$

We have used the fact that $\chi(-1) = 1$. Making the substitutions $u = Du'$
and $v = Dv'$ we find

$$N_p = p + 1 + \rho\chi(D)J(\rho, \chi) + \overline{\rho\chi(D)}\,\overline{J(\rho, \chi)}, \tag{i}$$

where bar denotes complex conjugation.

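In order to analyze Equation (i) still further the following lemma will
be useful.


Lemma. Let $p$ be an odd prime, $\rho$ a character of order 2 and $\psi$ any nontrivial
character of $\mathbb{F}_p$. Then $J(\rho, \psi) = \psi(4)J(\psi, \psi)$.

Proof.

$$J(\rho, \psi) = \sum_{u+v=1} \rho(u)\psi(v) = \sum_{u+v=1} (1 + \rho(u))\psi(v) = \sum_{u+v=1} N(t^2 = u)\psi(v)$$

$$= \sum_t \psi(1 - t^2) = \psi(4)\sum_t \psi\left(\frac{1 - t}{2}\right)\psi\left(\frac{1 + t}{2}\right) = \psi(4)J(\psi, \psi). \quad \square$$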

Using the lemma, Equation (i) can be transformed into

$$N_p = p + 1 + \rho\chi(4D)J(\chi, \chi) + \overline{\rho\chi(4D)}\,\overline{J(\chi, \chi)}. \tag{ii}$$

We want to specify $\rho$ and $\chi$. Since $p \equiv 1$ (3), $p = \pi\bar{\pi}$ in $\mathbb{Z}[\omega]$ (recall that
$\omega = e^{2\pi i/3}$) where we can take $\pi$ and $\bar{\pi}$ to be primary, i.e., $\pi \equiv \bar{\pi} \equiv 2$ (3).
Let $(a/\pi)_6$ be the sixth power residue symbol and take $\rho(a) = (a/\pi)_6^3$ and
$\chi(a) = (a/\pi)_6^2 = (a/\pi)_3$. Then $\rho\chi(a) = \rho(a)\chi(a) = (a/\pi)_6^5 = (a/\pi)_6^{-1}$. Finally,
if we set $\chi_\pi(a) = (a/\pi)_3$ then Lemma 1 of Section 4, Chapter 9 shows
$J(\chi_\pi, \chi_\pi) = \pi$. Substituting this information into Equation (ii) we find


Theorem 4. Suppose $p \neq 2$ or $3$ and $p \nmid D$. Consider the elliptic curve
$y^2 = x^3 + D$ over $\mathbb{F}_p$. If $p \equiv 2$ (3) then $N_p = p + 1$. If $p \equiv 1$ (3) let $p = \pi\bar{\pi}$ with
$\pi \in \mathbb{Z}[\omega]$ and $\pi \equiv 2$ (3). Then

$$N_p = p + 1 + \overline{\left(\frac{4D}{\pi}\right)_6}\,\pi + \left(\frac{4D}{\pi}\right)_6 \bar{\pi}.$$

Theorem 4 completely determines the local zeta function of $y^2 = x^3 + D$.

As an example consider the curve $y^2 = x^3 + 1$ over $\mathbb{F}_{13}$. We find
$13 = (-1 + 3\omega)(-1 + 3\omega^2)$ and $-1 + 3\omega \equiv 2$ (3). To apply the formula in the
theorem we must know $(4/(-1 + 3\omega))_6 = (2/(-1 + 3\omega))_3$. Since
$2^{(13-1)/3} = 2^4 \equiv 3 \equiv \omega^2$ $(-1 + 3\omega)$ it follows that $(2/(-1 + 3\omega))_3 = \omega^2$. The formula
in the theorem gives

$$N_{13} = 13 + 1 + \omega(-1 + 3\omega) + \omega^2(-1 + 3\omega^2)$$

$$= 14 + 2(\omega^2 + \omega) = 14 - 2 = 12.$$

One checks that the points on $y^2 = x^3 + 1$ with coefficients in $\mathbb{F}_{13}$ are
$\infty$, $(4, 0)$, $(10, 0)$, $(12, 0)$, $(0, \pm 1)$, $(2, \pm 3)$, $(5, \pm 3)$, and $(6, \pm 3)$.
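
The formula of Theorem 4 can be checked against a direct count for small primes. The sketch below is an illustration under stated assumptions rather than part of the original argument: elements of $\mathbb{Z}[\omega]$ are stored as pairs $(u, v)$ meaning $u + v\omega$, the residue field $\mathbb{Z}[\omega]/(\pi)$ is identified with $\mathbb{F}_p$ by sending $\omega$ to a root $r$ of $a + br \equiv 0$ $(p)$, and all function names are hypothetical. It needs Python 3.8 or later for modular inverses via `pow`.

```python
def brute_force_count(D, p):
    """Number of points on y^2 = x^3 + D over F_p, including infinity."""
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    return 1 + sum(square_roots.get((x**3 + D) % p, 0) for x in range(p))


# Powers (-omega)^k for k = 0..5, written as (u, v) = u + v*omega.
SIXTH_ROOTS = [(1, 0), (0, -1), (-1, -1), (-1, 0), (0, 1), (1, 1)]


def primary_factor(p):
    """Find pi = a + b*omega with norm p and pi = 2 (mod 3)."""
    for a in range(-p, p + 1):
        for b in range(-p, p + 1):
            if a * a - a * b + b * b == p and a % 3 == 2 and b % 3 == 0:
                return a, b
    raise ValueError("expected a prime p with p = 1 (mod 3)")


def theorem4_count(D, p):
    """N_p as given by Theorem 4 (p prime, p not 2 or 3, p not dividing D)."""
    if p % 3 == 2:
        return p + 1
    a, b = primary_factor(p)
    r = (-a * pow(b % p, -1, p)) % p        # image of omega in F_p
    t = pow((4 * D) % p, (p - 1) // 6, p)   # (4D)^((p-1)/6) mod pi
    k = next(k for k in range(6) if pow(p - r, k, p) == t)
    u, v = SIXTH_ROOTS[k]                   # the symbol (4D/pi)_6
    cu, cv = u - v, -v                      # its complex conjugate
    # (cu + cv*omega)(a + b*omega), using omega^2 = -1 - omega
    x = cu * a - cv * b
    y = cu * b + cv * a - cv * b
    return p + 1 + (2 * x - y)              # z + conj(z) = 2x - y for z = x + y*omega


for p in (7, 13, 19, 31):
    print(p, theorem4_count(1, p), brute_force_count(1, p))
```

For $p = 13$ and $D = 1$ both computations give 12, as in the example above. Either of the two primary factors $\pi$, $\bar{\pi}$ may be returned by the search, which is harmless because the formula is symmetric under conjugation.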


§4 $y^2 = x^3 - Dx$, the Local Case

Let $D$ be a nonzero integer and consider the elliptic curve $E$ defined by
$x_0 x_2^2 - x_1^3 + D x_1 x_0^2 = 0$ or, in affine coordinates, $y^2 = x^3 - Dx$. The
discriminant of $E$ is $\Delta = 2^6 D^3$. We will only consider primes $p$ such that
$p \neq 2$ and $p \nmid D$.

The curve $y^2 = x^3 - \bar{D}x$ over $\mathbb{F}_p$ (we continue to write $D$ instead of $\bar{D}$)
has one point at infinity so that $N_p = 1 + N(y^2 = x^3 - Dx)$. The methods
of Chapter 8 are not immediately applicable in this case. We will first
transform the curve $y^2 = x^3 - Dx$ into the curve $u^2 = v^4 + 4D$. The number of
solutions to $u^2 = v^4 + 4D$ can then be handled by our previous methods.

For the moment let $C$ denote the curve $y^2 = x^3 - Dx$ and $C'$ denote the
curve $u^2 = v^4 + 4D$. Define a transformation $T$ as follows:

$$T(u, v) = \left(\tfrac{1}{2}(u + v^2),\ \tfrac{1}{2}v(u + v^2)\right).$$

A simple calculation shows that $T$ maps $C'$ to $C$. The point $(0, 0)$ on $C$
is not in the image since $4D = u^2 - v^4 = (u - v^2)(u + v^2)$ shows $u + v^2 \neq 0$.

Define a transformation $S$ by

$$S(x, y) = \left(2x - \frac{y^2}{x^2},\ \frac{y}{x}\right).$$

It is easily shown that $S$ maps $C - \{(0, 0)\}$ to $C'$ and moreover $TS$ is the
identity on $C - \{(0, 0)\}$ and $ST$ is the identity on $C'$. Let
$N' = N(u^2 = v^4 + 4D)$ and $N = N(y^2 = x^3 - Dx)$. We have shown that $N - 1 = N'$.
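
The claim that $T$ and $S$ are mutually inverse bijections between $C'$ and $C - \{(0, 0)\}$ is easy to test over a small prime field. The sketch below enumerates both curves over $\mathbb{F}_p$ and checks the maps point by point; the function names are illustrative, $p$ is assumed to be an odd prime with $p \nmid D$, and Python 3.8 or later is assumed for modular inverses via `pow`.

```python
def curve_C(D, p):
    """Affine points of y^2 = x^3 - D*x over F_p."""
    return {(x, y) for x in range(p) for y in range(p)
            if (y * y - x**3 + D * x) % p == 0}


def curve_C_prime(D, p):
    """Affine points of u^2 = v^4 + 4D over F_p."""
    return {(u, v) for u in range(p) for v in range(p)
            if (u * u - v**4 - 4 * D) % p == 0}


def T(u, v, p):
    """T(u, v) = ((u + v^2)/2, v(u + v^2)/2), computed in F_p."""
    x = pow(2, -1, p) * (u + v * v) % p
    return x, v * x % p


def S(x, y, p):
    """S(x, y) = (2x - y^2/x^2, y/x), computed in F_p; requires x != 0."""
    v = y * pow(x, -1, p) % p
    return (2 * x - v * v) % p, v


p, D = 13, 1
image = {T(u, v, p) for (u, v) in curve_C_prime(D, p)}
print(image == curve_C(D, p) - {(0, 0)})
print(all(S(*T(u, v, p), p) == (u, v) for (u, v) in curve_C_prime(D, p)))
```

Both checks are expected to print `True`, which reflects the identity $N - 1 = N'$ for this choice of $p$ and $D$.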

If $p \equiv 3$ (4) then $-1$ is a quadratic nonresidue so every element of $\mathbb{F}_p$
is of the form $\pm w^2$. Thus every square is automatically a fourth power.
Consequently,

$$N' = N(u^2 = v^4 + 4D) = N(u^2 = v^2 + 4D) = p - 1.$$

Thus we find that if $p \equiv 3$ (4), $N_p = 1 + N = 2 + N' = 2 + p - 1 = p + 1$.

Suppose now that $p \equiv 1$ (4). Let $\lambda$ be a character of order 4 of $\mathbb{F}_p$ and set
$\rho = \lambda^2$. Then, by the now familiar process, we find

$$N(u^2 = v^4 + 4D) = \sum_{r+s=4D} N(u^2 = r)N(v^4 = -s)$$

$$= p - 1 + \overline{\lambda(-4D)}J(\rho, \lambda) + \lambda(-4D)\overline{J(\rho, \lambda)}. \tag{i}$$

We have used the fact that for $p \equiv 1$ (4), $J(\rho, \rho) = -1$ (see Chapter 8,
Section 3, Theorem 1). By the lemma of the previous section we have
$J(\rho, \lambda) = \lambda(4)J(\lambda, \lambda)$. Thus, $\overline{\lambda(-4D)}J(\rho, \lambda) = \overline{\lambda(D)}\lambda(-1)J(\lambda, \lambda)$.

We now specify $\lambda$. Since $p \equiv 1$ (4), $p = \pi\bar{\pi}$ in $\mathbb{Z}[i]$ with $\pi$ primary, i.e.,
$\pi \equiv 1$ $(2 + 2i)$. Identify $\mathbb{F}_p$ with $\mathbb{Z}[i]/\pi\mathbb{Z}[i]$ and choose $\lambda$ to be the biquadratic
residue symbol, $\lambda(a) = (a/\pi)_4$. Then, by Proposition 9.9.4 we have

$$-\lambda(-1)J(\lambda, \lambda) = \pi.$$

Starting from Equation (i) and substituting all this information we arrive at


Theorem 5. Suppose $p \neq 2$ and $p \nmid D$. Consider the elliptic curve $y^2 = x^3 - Dx$
over $\mathbb{F}_p$. If $p \equiv 3$ (4) then $N_p = p + 1$. If $p \equiv 1$ (4) let $p = \pi\bar{\pi}$ with $\pi \in \mathbb{Z}[i]$
and $\pi \equiv 1$ $(2 + 2i)$. Then

$$N_p = p + 1 - \overline{\left(\frac{D}{\pi}\right)_4}\,\pi - \left(\frac{D}{\pi}\right)_4 \bar{\pi}.$$

As an example, consider $y^2 = x^3 - x$ over $\mathbb{F}_{13}$. One sees

$$13 = (3 + 2i)(3 - 2i)$$

and $3 + 2i \equiv 1$ $(2 + 2i)$. The formula of the theorem tells us that
$N_{13} = 13 + 1 - (3 + 2i) - (3 - 2i) = 14 - 6 = 8$. In fact, a short calculation
shows the points on $y^2 = x^3 - x$ with coefficients in $\mathbb{F}_{13}$ are $\infty$, $(0, 0)$,
$(1, 0)$, $(-1, 0)$, $(5, \pm 4)$, and $(-5, \pm 6)$.
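
Theorem 5 can likewise be compared with a direct count. In the sketch below, which is an illustration with hypothetical function names, a Gaussian integer $a + bi$ is stored as the pair $(a, b)$, the condition $\pi \equiv 1$ $(2 + 2i)$ is tested as "$b$ even and $a + b \equiv 1$ (4)", and $\mathbb{Z}[i]/(\pi)$ is identified with $\mathbb{F}_p$ by sending $i$ to a root $r$ of $a + br \equiv 0$ $(p)$. Python 3.8 or later is assumed for modular inverses via `pow`.

```python
def brute_force_count(D, p):
    """Number of points on y^2 = x^3 - D*x over F_p, including infinity."""
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    return 1 + sum(square_roots.get((x**3 - D * x) % p, 0) for x in range(p))


# Powers i^k for k = 0..3, written as (c, d) = c + d*i.
I_POWERS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def primary_factor(p):
    """Find pi = a + b*i with norm p and pi = 1 (mod 2 + 2i)."""
    for a in range(-p, p + 1):
        for b in range(-p, p + 1):
            if a * a + b * b == p and b % 2 == 0 and (a + b) % 4 == 1:
                return a, b
    raise ValueError("expected a prime p with p = 1 (mod 4)")


def theorem5_count(D, p):
    """N_p as given by Theorem 5 (p an odd prime not dividing D)."""
    if p % 4 == 3:
        return p + 1
    a, b = primary_factor(p)
    r = (-a * pow(b % p, -1, p)) % p     # image of i in F_p
    t = pow(D % p, (p - 1) // 4, p)      # D^((p-1)/4) mod pi
    k = next(k for k in range(4) if pow(r, k, p) == t)
    c, d = I_POWERS[k]                   # the symbol (D/pi)_4
    # conj(symbol) * pi has real part c*a + d*b; add its conjugate to double it.
    return p + 1 - 2 * (c * a + d * b)


for p in (5, 13, 17, 29):
    print(p, theorem5_count(1, p), brute_force_count(1, p))
```

For $p = 13$ and $D = 1$ both computations give 8, in agreement with the list of points above.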


§5 Hecke L-functions 

In two important papers published in 1918 and 1920 the German
mathematician E. Hecke introduced a new class of characters and L-functions.
These can be defined over arbitrary algebraic number fields. We shall 
confine our attention to algebraic Hecke characters over CM fields of a 
certain type (the terminology will be explained below). For the applications 
we have in mind this will suffice. 

Let $K/\mathbb{Q}$ be an algebraic number field. An isomorphism $\sigma$ of $K$ into $\mathbb{C}$
is called real if $\sigma(K) \subset \mathbb{R}$, otherwise it is called complex. $K$ is said to be totally
real if every isomorphism of $K$ into $\mathbb{C}$ is real. $K$ is said to be totally complex
if every isomorphism of $K$ into $\mathbb{C}$ is complex. $K$ is called a CM field if it is
a totally complex quadratic extension of a totally real subfield $K_0$. For
example, if $d \in \mathbb{Q}$ with $d > 0$ then $\mathbb{Q}(\sqrt{-d})$ is a CM field. Other examples
are provided by cyclotomic fields $\mathbb{Q}(\zeta_m)$. The totally real subfield of $\mathbb{Q}(\zeta_m)$
is $\mathbb{Q}(\zeta_m + \zeta_m^{-1})$.

Let $K \subset \mathbb{C}$ be a CM field such that $K/\mathbb{Q}$ is a Galois extension. Let $j$
be the restriction of complex conjugation to $K$. Then it is easily seen
(Exercise 2) that $j$ is in the center of $G$, the Galois group of $K/\mathbb{Q}$. Moreover,
$K_0$ is the fixed field of $j$. From now on we assume $K$ satisfies these conditions.

Let $\mathcal{O} \subset K$ be the ring of integers and $M \subseteq \mathcal{O}$ an ideal. An algebraic
Hecke character modulo $M$ is a function $\chi$ from the ideals of $\mathcal{O}$ to $\mathbb{C}$ subject
to the following conditions.

(i) $\chi(\mathcal{O}) = 1$.

(ii) $\chi(A) \neq 0$ if and only if $A$ is relatively prime to $M$.

(iii) $\chi(AB) = \chi(A)\chi(B)$.

(iv) There is an element $\theta = \sum n(\sigma)\sigma \in \mathbb{Z}[G]$ such that if $\alpha \in \mathcal{O}$, $\alpha \equiv 1$ $(M)$,
then $\chi((\alpha)) = \alpha^\theta$.

(v) There is an integer $m > 0$ such that $n(\sigma) + n(j\sigma) = m$ for all $\sigma \in G$.

The last condition is easily seen to be equivalent to $(1 + j)\theta = mN$,
where $N = \sum \sigma$ is the norm element in $\mathbb{Z}[G]$.

The number $m$ in condition (v) is called the weight of $\chi$.
Another thing to note is that by condition (iii) $\chi$ is completely determined
by its values on prime ideals not dividing $M$.

Proposition 18.5.1. Let $\chi$ be an algebraic Hecke character of weight $m$. Then
if $(A, M) = (1)$, $|\chi(A)| = NA^{m/2}$.

Proof. Let $I_M$ be the set of ideals in $\mathcal{O}$ which are relatively prime to $M$.
We put an equivalence relation on $I_M$ as follows: if $A, B \in I_M$ we say $A \sim B$
if there exist $\alpha, \beta \in \mathcal{O}$ such that $\alpha, \beta \equiv 1$ $(M)$ and $(\alpha)A = (\beta)B$. It can be shown
that the equivalence classes are finite in number and form a group $C_M$.
The product in this group takes the equivalence class of $A$ and the equivalence
class of $B$ to the equivalence class of $AB$. If $M = \mathcal{O}$, this construction yields
the ideal class group of $\mathcal{O}$ (see Chapter 12, Section 1). Let $h$ be the number of
elements in $C_M$.

If $A \in I_M$ there exist $\alpha, \beta \in \mathcal{O}$, $\alpha, \beta \equiv 1$ $(M)$, such that $(\alpha)A^h = (\beta)$. Thus,

$$\alpha^\theta \chi(A)^h = \beta^\theta.$$

Take complex conjugates of both sides and multiply. This yields

$$(\alpha^\theta)^{1+j}|\chi(A)|^{2h} = (\beta^\theta)^{1+j},$$

or, by (v),

$$(N\alpha)^m |\chi(A)|^{2h} = (N\beta)^m.$$

Since $(\alpha)A^h = (\beta)$ we also have

$$N\alpha\, NA^h = N\beta.$$

Comparing these last two equations we find $|\chi(A)|^{2h} = NA^{mh}$ and
$|\chi(A)| = NA^{m/2}$. □

It should be noted that the proof shows that the values $\chi(A)$ are algebraic
numbers (in fact, $h$th roots of elements of $K$). This is a partial explanation
of why $\chi$ is called an "algebraic" Hecke character.

We now proceed to attach an L-function to an algebraic Hecke character
$\chi$. Namely, define

$$L(s, \chi) = \prod_P (1 - \chi(P)NP^{-s})^{-1} = \sum_A \chi(A)NA^{-s}.$$

The product is over all prime ideals in $\mathcal{O}$, and the sum over all ideals in $\mathcal{O}$.
Simple estimates show that the product converges absolutely for Re $s > 1 + m/2$
and uniformly for Re $s \ge 1 + m/2 + \delta$ for any $\delta > 0$. Indeed, the
product converges absolutely if and only if $\sum_P |\chi(P)NP^{-s}|$ converges.
By Proposition 18.5.1, if $s$ is real

$$|\chi(P)NP^{-s}| = NP^{(m/2)-s} \le p^{-(s-(m/2))}$$

where $p$ is the rational prime below $P$. Since every rational prime has at
most $[K : \mathbb{Q}]$ primes above it in $\mathcal{O}$ we see

$$\sum_P |\chi(P)NP^{-s}| \le [K : \mathbb{Q}] \sum_p p^{-(s-(m/2))},$$

which converges for $s > 1 + m/2$.

Using the fact that the product for $L(s, \chi)$ converges absolutely for
Re $s > 1 + m/2$ it can be shown the sum also converges in this region and
that the two are equal.

The crucial fact which we need about Hecke L-functions is given by the 
following theorem. We will not give the proof which is long and difficult. 

Theorem 6. Let $\chi$ be an algebraic Hecke character and $L(s, \chi)$ the corresponding
L-function. If $\chi(A)$ is not equal to 0 or 1 for some $A$, then $L(s, \chi)$ can be
analytically continued to an entire function on all of $\mathbb{C}$.

It should be pointed out that this theorem is true for all number fields
and all Hecke L-functions, not only those which come from algebraic Hecke
characters. Moreover, Hecke established a very important functional
equation for his L-functions. When $\chi$ is an algebraic Hecke character of
weight $m$ the functional equation relates $L(s, \chi)$ with $L(m + 1 - s, \bar{\chi})$.

Some authors normalize by defining $\tilde{\chi}(A) := \chi(A)/NA^{m/2}$. Then
$L(s, \tilde{\chi}) = \prod_P (1 - \tilde{\chi}(P)NP^{-s})^{-1}$ converges for Re $s > 1$ using the same reasoning
as for $L(s, \chi)$ together with the fact that for $(A, M) = (1)$ one has $|\tilde{\chi}(A)| = 1$.

We will work directly with the Hecke character $\chi$.

In the next two sections we will show the L-function, $L(E, s)$, for elliptic
curves of the form $y^2 = x^3 + D$ and $y^2 = x^3 - Dx$ are Hecke L-functions.
In the first case we will construct an algebraic Hecke character on $\mathbb{Q}(\omega)$ and
in the second case on $\mathbb{Q}(i)$.

One final comment. In Chapter 14 we defined by means of Gauss sums a
function $\Phi(A)$ on the ideals of $\mathbb{Q}(\zeta_m)$ which are prime to $m$. It can be shown
that $\Phi(A)$ extends to an algebraic Hecke character for the modulus $(m^2)$
of weight $m$. This was first shown by A. Weil in [81]. In a later paper [236]
he points out that the case where $m$ is an odd prime goes back to Eisenstein.

§6 $y^2 = x^3 - Dx$, the Global Case

We will now analyze the global zeta function of the elliptic curve $E$ defined
by $y^2 = x^3 - Dx$, $D \in \mathbb{Z}$. It is enough to consider the associated L-function
$L(E, s)$. Since $\Delta = 2^6 D^3$ in this case we have (see Equation (xi) of Section 2)

$$L(E, s) = \prod_{p \nmid 2D} (1 - a_p p^{-s} + p^{1-2s})^{-1}.$$

The numbers $a_p$ are determined by $N_p = p + 1 - a_p$ and $N_p$ has been
determined in Theorem 5, Section 4.

We are going to construct an algebraic Hecke character $\chi$ on $\mathbb{Z}[i]$ with
respect to the modulus $(8D)$ such that $L(E, s) = L(s, \chi)$.

To construct $\chi$ it is enough to specify $\chi(P)$ for prime ideals $P$ in $\mathbb{Z}[i]$.
If $P$ divides $2D$ define $\chi(P) = 0$. Suppose $P$ does not divide $2D$. If $NP = p$,
then $p \equiv 1$ (4) and $P = (\pi)$ with $\pi \equiv 1$ $(2 + 2i)$. Define $\chi(P) = \overline{(D/\pi)_4}\,\pi$.
If $NP = p^2$, then $p \equiv 3$ (4) and $P = (p)$. Define $\chi(P) = -p$.

Lemma. Suppose $p \equiv 3$ (4). Then $(D/p)_4 = 1$.

Proof. Let $P$ be the prime ideal in $\mathbb{Z}[i]$ generated by $p$. Then
$(D/p)_4 = (D/P)_4 \equiv D^{(NP-1)/4}$ $(P)$. Since $NP = p^2$ we have
$(NP - 1)/4 = (p^2 - 1)/4 = (p - 1)(p + 1)/4$. By Fermat's Little Theorem $D^{p-1} \equiv 1$ $(p)$ which implies
$(D/p)_4 \equiv 1$ $(P)$ and so $(D/p)_4 = 1$. □
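
The congruence at the heart of this proof is easy to test numerically. The short sketch below, whose function name is an assumption made for the illustration, checks that $D^{(p^2-1)/4} \equiv 1$ $(p)$ for a few primes $p \equiv 3$ (4) and integers $D$ prime to $p$.

```python
def fourth_power_symbol_is_trivial(D, p):
    """Check D^((p^2 - 1)/4) = 1 (mod p) for a prime p = 3 (mod 4), p not dividing D."""
    return pow(D, (p * p - 1) // 4, p) == 1


for p in (3, 7, 11, 19, 23):
    results = [fourth_power_symbol_is_trivial(D, p)
               for D in range(1, 30) if D % p != 0]
    print(p, all(results))
```

The exponent $(p^2 - 1)/4$ is a multiple of $p - 1$ exactly because $4 \mid p + 1$, which is where the hypothesis $p \equiv 3$ (4) enters.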

As a consequence of the lemma we can define $\chi(P)$ uniformly for prime
ideals $P$ not dividing $2D$. If $P = (\pi)$ where $\pi \equiv 1$ $(2 + 2i)$ then
$\chi(P) = \overline{(D/\pi)_4}\,\pi$.

Theorem 7. Let $E$ be the elliptic curve defined by $y^2 = x^3 - Dx$ with $D \in \mathbb{Z}$.
The character $\chi$ defined above is an algebraic Hecke character of weight 1
for the modulus $(8D)$. Moreover, $L(E, s) = L(s, \chi)$.

Proof. Assume to begin with that $p \equiv 3$ (4) and $p \nmid 2D$. By Theorem 5,
$N_p = p + 1$ so that $a_p = 0$. Let $P = (p)$. Then $NP = p^2$ and $\chi(P) = -p$.
Thus

$$1 - a_p p^{-s} + p^{1-2s} = 1 + p^{1-2s} = 1 - \chi(P)NP^{-s}.$$

Now suppose $p \equiv 1$ (4) and $p \nmid 2D$. Write $p\mathbb{Z}[i] = P\bar{P}$, $P = (\pi)$ and
$\pi \equiv 1$ $(2 + 2i)$. Then, $NP = p$ and, by Theorem 5, $a_p = \overline{(D/\pi)_4}\,\pi + (D/\pi)_4\,\bar{\pi}$.
Thus

$$1 - a_p p^{-s} + p^{1-2s} = \left(1 - \overline{\left(\frac{D}{\pi}\right)_4}\,\pi p^{-s}\right)\left(1 - \left(\frac{D}{\pi}\right)_4 \bar{\pi} p^{-s}\right)$$

$$= (1 - \chi(P)NP^{-s})(1 - \chi(\bar{P})N\bar{P}^{-s}).$$

We have used $(D/\bar{\pi})_4 = \overline{(D/\pi)_4}$. Putting these facts together yields

$$L(E, s) = \prod_P (1 - \chi(P)NP^{-s})^{-1} = \sum_A \chi(A)NA^{-s} = L(s, \chi).$$

It remains to show that $\chi$ is an algebraic Hecke character of weight 1
for the modulus $(8D)$.

It is clear for $A$ relatively prime to $2D$ that $\chi(A) = \overline{(D/\alpha)_4}\,\alpha$ where $\alpha$ is
the unique generator of $A$ such that $\alpha \equiv 1$ $(2 + 2i)$. The theorem will be
proved if we can show $\alpha \equiv 1$ $(8D)$ implies $(D/\alpha)_4 = 1$. To do this we will
have to separate the cases $D \equiv 1$ (4), $D \equiv 3$ (4), and $D$ even.

If $D \equiv 1$ (4), then by Proposition 9.9.8 we have $(D/\alpha)_4 = (\alpha/D)_4$. Since
$\alpha \equiv 1$ $(D)$, $(\alpha/D)_4 = 1$ and we are done in this case.

Before going further we need a remark about $(i/\alpha)_4$. If $\alpha \equiv 1$ (8) we
claim $(i/\alpha)_4 = 1$. To see this note first that $(i/\alpha)_4 = i^{(N\alpha - 1)/4}$. If
$\alpha = a + bi \equiv 1$ (8), then $a - 1 \equiv 0$ (8) and $b \equiv 0$ (8). Thus,
$N\alpha - 1 = a^2 + b^2 - 1 = (a^2 - 1) + b^2 \equiv 0$ (16). This proves the assertion.

Now suppose $D \equiv 3$ (4). Assume $\alpha \equiv 1$ $(8D)$. Using Proposition 9.9.8
and the above remark we have $(D/\alpha)_4 = (i^2 D/\alpha)_4 = (-D/\alpha)_4 = (\alpha/D)_4 = 1$.

It remains to treat the case where $D$ is even. Write $D = 2^t D_0$ where $D_0$
is odd. Assume $\alpha \equiv 1$ $(8D)$. By what has been proved to this point $(D_0/\alpha)_4 = 1$.
It thus suffices to show $(2/\alpha)_4 = 1$. For this we need a supplement to the
law of biquadratic reciprocity. Namely, assume $\alpha = a + bi$ is primary.
Then

$$\left(\frac{1 + i}{\alpha}\right)_4 = i^{(a - b - b^2 - 1)/4}.$$

A proof of this in the case when $\alpha$ is a prime element has been outlined in
the Exercises to Chapter 9. It is not difficult to go from the case of $\alpha$ a prime
to that of $\alpha$ primary.

If $\alpha \equiv 1$ $(8D)$ and $D$ is even then $\alpha \equiv 1$ (16). It follows that $a - 1 \equiv 0$ (16)
and $b \equiv 0$ (16) and so $((1 + i)/\alpha)_4 = 1$. Thus

$$\left(\frac{2}{\alpha}\right)_4 = \left(\frac{i^3(1 + i)^2}{\alpha}\right)_4 = \left(\frac{i}{\alpha}\right)_4^3 \left(\frac{1 + i}{\alpha}\right)_4^2 = 1.$$

The proof is now complete.

§7 $y^2 = x^3 + D$, the Global Case

In order to analyze the L-function of the elliptic curve defined by
$y^2 = x^3 + D$, $D \in \mathbb{Z}$, we proceed as in the last section. Since the discriminant in
this case is $\Delta = -2^4 3^3 D^2$ we have

$$L(E, s) = \prod_{p \nmid 6D} (1 - a_p p^{-s} + p^{1-2s})^{-1}.$$

The numbers $a_p$ are determined by $N_p = p + 1 - a_p$ and $N_p$ has been
determined by Theorem 4, Section 3.

We will construct an algebraic Hecke character $\chi$ on $\mathbb{Z}[\omega]$ of weight 1
with respect to the modulus $(12D)$, and show that $L(E, s) = L(s, \chi)$.

Let $P \subset \mathbb{Z}[\omega]$ be a prime ideal. If $P$ divides $6D$ define $\chi(P) = 0$. Assume
now that $P \nmid 6D$. If $NP = p$, then $p \equiv 1$ (3) and $P = (\pi)$ with $\pi$ primary,
i.e., $\pi \equiv 2$ (3). Define $\chi(P) = -\overline{(4D/\pi)_6}\,\pi$. If $NP = p^2$, then $p \equiv 2$ (3) and
$P = (p)$. Define $\chi(P) = -p$.

Lemma 1. Suppose $p$ is an odd prime and $p \equiv 2$ (3). Then $(4D/p)_6 = 1$.

Proof. It follows from the hypotheses that $p + 1$ is divisible by 6. We know
$(4D)^{p-1} \equiv 1$ $(p)$. Raising both sides of this congruence to the $((p + 1)/6)$th
power gives the result. □

Lemma 1 permits us to give a uniform definition of $\chi(P)$. If $P \nmid 6D$ write
$P = (\pi)$ with $\pi \equiv 2$ (3). Then $\chi(P) = -\overline{(4D/\pi)_6}\,\pi$.


Lemma 2. Suppose $\alpha \in \mathbb{Z}[\omega]$ and $(\alpha, 2D) = (1)$. Define $(D/\alpha)_2$ to be $(D/\alpha)_6^3$.
Then $(D/\alpha)_2 = (D/N\alpha)$, where this last symbol is the Jacobi symbol (see
Chapter 5, Section 2).

Proof. Both $(D/\alpha)_2$ and $(D/N\alpha)$ are multiplicative in $\alpha$. Thus it is enough to
check that they are equal when $\alpha = \pi$, a prime element.

Suppose $\pi = p \neq 2$, a rational prime with $p \equiv 2$ (3). Then $Np = p^2$ and
so $(D/Np) = (D/p)^2 = 1$. On the other hand

$$\left(\frac{D}{p}\right)_2 \equiv D^{(p^2-1)/2} = (D^{p-1})^{(p+1)/2} \equiv 1 \ (p).$$

Thus, $(D/Np) = 1 = (D/p)_2$.

Assume now that $\pi$ is a complex prime and so $N\pi = p \equiv 1$ (3). Then

$$\left(\frac{D}{\pi}\right)_2 \equiv D^{(p-1)/2} \equiv \left(\frac{D}{p}\right) \ (\pi).$$

Since $p = N\pi$, it follows that $(D/\pi)_2 = (D/N\pi)$ and the proof is complete.

Theorem 8. Let $E$ be the elliptic curve over $\mathbb{Q}$ defined by $y^2 = x^3 + D$, $D \in \mathbb{Z}$.
The character $\chi$ defined above is an algebraic Hecke character of weight 1
for the modulus $(12D)$. Moreover, $L(E, s) = L(s, \chi)$.

Proof. Assume first that $p \equiv 2$ (3) and $p \nmid 6D$. By Theorem 4, $N_p = p + 1$
so that $a_p = 0$. Let $P = (p)$. $P$ is a prime ideal in $\mathbb{Z}[\omega]$ and $\chi(P) = -p$.
Thus

$$1 - a_p p^{-s} + p^{1-2s} = 1 + p^{1-2s} = 1 - \chi(P)NP^{-s}.$$

Now suppose $p \equiv 1$ (3) and $p \nmid 6D$. Write $p\mathbb{Z}[\omega] = P\bar{P}$ where $P = (\pi)$
with $\pi \equiv 2$ (3). Then $NP = p$ and by Theorem 4,
$a_p = -\overline{(4D/\pi)_6}\,\pi - (4D/\pi)_6\,\bar{\pi}$. Thus

$$1 - a_p p^{-s} + p^{1-2s} = \left(1 + \overline{\left(\frac{4D}{\pi}\right)_6}\,\pi p^{-s}\right)\left(1 + \left(\frac{4D}{\pi}\right)_6 \bar{\pi} p^{-s}\right)$$

$$= (1 - \chi(P)NP^{-s})(1 - \chi(\bar{P})N\bar{P}^{-s}).$$

We have used the fact that $(4D/\bar{\pi})_6 = \overline{(4D/\pi)_6}$. Putting these facts together,

$$L(E, s) = \prod_P (1 - \chi(P)NP^{-s})^{-1} = \sum_A \chi(A)NA^{-s} = L(s, \chi).$$

It remains to show that $\chi$ is an algebraic Hecke character of weight 1
for the modulus $(12D)$.

It is clear that for A relatively prime to 12D we have = (4D/a) 6 a, 
where a is the unique generator of A such that a = 1 (3). We will be done if 
we can show a = 1 (12D) implies (4D/a) 6 = 1. 

Since 1 = (4D/oc) 6 (4D/oi)l(4D/(x)l it is enough to show a = 1 (12D) 
implies (4D/a) 3 = 1 and, by Lemma 2, that (4D/N(x) = 1. We do both 
implications in turn. 

Assume $3 \nmid D$. Since $\alpha \equiv 1\ (3)$ and $\alpha$ is relatively prime to $4D$, we have
by cubic reciprocity (Theorem 1, Chapter 9), $(4D/\alpha)_3 = (-\alpha/4D)_3 =
(\alpha/4D)_3 = 1$. The last equality follows from $\alpha \equiv 1\ (4D)$.

If $3 \mid D$, write $D = 3^t D_0$ with $3 \nmid D_0$. Then $(4D/\alpha)_3 = (3/\alpha)_3^t (4D_0/\alpha)_3 =
(3/\alpha)_3^t$. We must show $\alpha \equiv 1\ (12D)$ and $3 \mid D$ implies $(3/\alpha)_3 = 1$. The
hypotheses imply $\alpha \equiv 1\ (9)$. We need the supplements to the law of cubic
reciprocity. These can be stated as follows. If $\gamma \in \mathbb{Z}[\omega]$ is primary, then
$\gamma = a + b\omega \equiv 2\ (3)$. Write $a = 3m - 1$ and $b = 3n$. Then

$$(\omega/\gamma)_3 = \omega^{m+n}, \qquad ((1-\omega)/\gamma)_3 = \omega^{2m}.$$

A proof is outlined in the Exercises to Chapter 9. Now, $3 = -\omega^2(1 - \omega)^2$
so $\alpha \equiv 1\ (9)$ implies $(3/\alpha)_3 = 1$ as desired.

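It remains to show $\alpha \equiv 1\ (12D)$ implies $(4D/N\alpha) = 1$. Now, $\alpha \equiv 1\ (12D)$
implies $N\alpha \equiv 1\ (4)$ and $N\alpha \equiv 1\ (D)$. If $D$ is odd we have

$$\left(\frac{4D}{N\alpha}\right) = \left(\frac{D}{N\alpha}\right) = \left(\frac{N\alpha}{D}\right) = 1.$$

We have used the law of quadratic reciprocity. If $D$ is even, write $D = 2^t D_0$
with $D_0$ odd. Then

$$\left(\frac{4D}{N\alpha}\right) = \left(\frac{2}{N\alpha}\right)^t \left(\frac{D_0}{N\alpha}\right).$$

The final thing to prove is that $D$ even and $\alpha \equiv 1\ (12D)$ implies $(2/N\alpha) = 1$.
The hypotheses imply $\alpha \equiv 1\ (8)$ so that $N\alpha \equiv 1\ (8)$ and so $(2/N\alpha) = 1$. □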

We conclude by observing that Theorems 6, 7, and 8 show that for
elliptic curves $E$ of the form $y^2 = x^3 - Dx$ or $y^2 = x^3 + D$, the L-function,
$L(E, s)$, can be analytically continued to all of $\mathbb{C}$. This proves Hasse's
conjecture for these curves!
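
The first case in the proof of Theorem 8 rests on the count $N_p = p + 1$ for $p \equiv 2\ (3)$. The following Python sketch checks this numerically by brute force for a few small primes. The function name `count_points` is our own and does not come from the text; it is an illustration, not an efficient point-counting method.

```python
def count_points(D, p):
    """Number of projective points on y^2 = x^3 + D over F_p (p an odd prime)."""
    # how many y in F_p have a given square y^2
    roots = {}
    for y in range(p):
        roots[y * y % p] = roots.get(y * y % p, 0) + 1
    affine = sum(roots.get((x ** 3 + D) % p, 0) for x in range(p))
    return affine + 1          # add the single point at infinity


for p in (5, 11, 17, 23):      # primes congruent to 2 modulo 3
    for D in (1, 2, 3):
        print(p, D, count_points(D, p), p + 1)
```

For a prime $p \equiv 1\ (3)$ the count generally differs from $p + 1$; for instance `count_points(1, 7)` gives 12, so that $a_7 = 7 + 1 - 12 = -4$.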



§8 Final Remarks 


In this chapter we have considered special types of elliptic curves defined 
over Q and investigated their local and global zeta functions. It is possible 
to generalize these considerations to algebraic varieties defined over algebraic 
number fields. We will go a short way along this path by considering curves 
defined by a single polynomial with coefficients in an algebraic number 
field. After giving the relevant definitions we will investigate the Fermat
curves $x_0^l + x_1^l + x_2^l = 0$, $l$ an odd prime. In this connection we will
encounter a class of algebraic Hecke characters defined by Jacobi sums.

Let $K$ be an algebraic number field and $\mathcal{O}_K$ its ring of integers. Let
$f(x_0, x_1, x_2) \in \mathcal{O}_K[x_0, x_1, x_2]$ be a nonsingular homogeneous polynomial
of positive degree, and let $C$ denote the algebraic curve defined by the
equation $f(x_0, x_1, x_2) = 0$. If $P$ is a prime ideal of $\mathcal{O}_K$ we may reduce the
coefficients of $f$ modulo $P$ to obtain a polynomial $\bar{f} \in \mathcal{O}_K/P[x_0, x_1, x_2]$. It
may be shown that there is a finite set of primes $\mathcal{S}$ such that for $P \notin \mathcal{S}$ the
reduced polynomial $\bar{f}$ is nonsingular. Let $C_P$ be the curve defined over
$\mathcal{O}_K/P$ by the equation $\bar{f}(x_0, x_1, x_2) = 0$. In Section 1, Chapter 11, we showed
how to attach a zeta function to $C_P$. Namely,

$$Z(C_P, u) = \exp\left(\sum_{m=1}^{\infty} \frac{N_m(P)u^m}{m}\right),$$

where $N_m(P)$ is the number of (projective) solutions to $\bar{f}(x_0, x_1, x_2) = 0$
in the extension of $\mathcal{O}_K/P$ of degree $m$. Recall that this extension is unique up
to isomorphism so that $N_m(P)$ is well defined.

Using the Riemann-Roch theorem one may show there is a polynomial
$H(C_P, u) \in \mathbb{Z}[u]$ with constant term equal to one such that

$$Z(C_P, u) = \frac{H(C_P, u)}{(1 - u)(1 - NPu)}. \tag{i}$$

If $P \in \mathcal{S}$ it is not easy to decide on the appropriate definition. For our
purposes we simply define $H(C_P, u) = 1$ if $P \in \mathcal{S}$.

The local zeta function of $C$ at $P$ is obtained by setting $u = NP^{-s}$ in
Equation (i). Namely,

$$\zeta(C_P, s) = \frac{H(C_P, NP^{-s})}{(1 - NP^{-s})(1 - NP^{1-s})}. \tag{ii}$$

This generalizes Equation (ix) of Section 2.

The global zeta function of $C$ is defined by

$$\zeta(C, s) = \prod_P \zeta(C_P, s). \tag{iii}$$

The product is over all nonzero prime ideals in $\mathcal{O}_K$.

The product $\prod_P (1 - NP^{-s})^{-1}$ is called the zeta function of $K$ and is
denoted by $\zeta_K(s)$. This function was first investigated by Dedekind. It
converges for $\mathrm{Re}(s) > 1$ and it was shown by Hecke that it can be continued to
a meromorphic function on all of $\mathbb{C}$ and satisfies a functional equation.
The only pole is a simple pole at $s = 1$.

Define $L(C_P, s) = H(C_P, NP^{-s})^{-1}$ and $L(C, s) = \prod_P L(C_P, s)$. Then
from Equations (ii) and (iii)

$$\zeta(C, s) = \frac{\zeta_K(s)\zeta_K(s - 1)}{L(C, s)}. \tag{iv}$$


It follows that if we wish to investigate whether C(C, s) can be analytically 
continued to all of C it is enough to concentrate on the function L(C, s). 

Fix an odd prime $l$. From now on we will consider the curve $C$ defined by
$x_0^l + x_1^l + x_2^l = 0$. It will be convenient to consider $C$ as being defined
over $K = \mathbb{Q}(\zeta_l)$ rather than over $\mathbb{Q}$. We set $\mathcal{O} = \mathbb{Z}[\zeta_l]$, the ring of integers
in $K$.

It is easy to see that the exceptional set $\mathcal{S}$ consists, in this case, of the single
prime ideal $\mathcal{L} = (1 - \zeta_l)$. If $P \ne \mathcal{L}$ we know $l$ divides $NP - 1$. It is this
fact which makes $K$ a more convenient field of definition.

Assume $P \notin \mathcal{S}$ and apply Theorem 2 of Section 3, Chapter 11 to the curve
$C_P$ over $\mathcal{O}/P$. We find

$$H(C_P, u) = \prod \left(1 + NP^{-1} g(\chi_0)g(\chi_1)g(\chi_2)u\right), \tag{v}$$

where the product is over 3-tuples of characters of $(\mathcal{O}/P)^*$ of order $l$ such that
$\chi_0\chi_1\chi_2 = \varepsilon$, the trivial character.

Since $g(\chi_1)g(\chi_2) = J(\chi_1, \chi_2)g(\chi_1\chi_2)$ and $\chi_0\chi_1\chi_2 = \varepsilon$ we find that
$g(\chi_0)g(\chi_1)g(\chi_2) = \chi_1\chi_2(-1)NP\,J(\chi_1, \chi_2)$. Since $-1 = (-1)^l$, $\chi_1\chi_2(-1) = 1$.
Substituting this information into Equation (v) we find

$$H(C_P, u) = \prod_{\chi_1, \chi_2} \left(1 + J(\chi_1, \chi_2)u\right), \tag{vi}$$

where the product is over pairs of characters of order $l$ such that $\chi_1\chi_2 \ne \varepsilon$.

Let $\chi_P(\alpha) = (\alpha/P)_l^{-1}$ for $\alpha \in \mathcal{O}$. This is the inverse of the $l$th power residue
symbol (see Chapter 14, Section 2). If $1 \le a, b \le l - 1$ and $a + b \ne l$
define $\lambda_{a,b}(P) = -J(\chi_P^a, \chi_P^b)$. With this notation we have

$$H(C_P, u) = \prod_{\substack{a, b = 1 \\ a + b \ne l}}^{l-1} \left(1 - \lambda_{a,b}(P)u\right) \tag{vii}$$

and so

$$L(C_P, s) = \prod_{\substack{a, b = 1 \\ a + b \ne l}}^{l-1} \left(1 - \lambda_{a,b}(P)NP^{-s}\right)^{-1}. \tag{viii}$$

Let us define $\lambda_{a,b}(\mathcal{L}) = 0$ and $L(s, \lambda_{a,b}) = \prod_P (1 - \lambda_{a,b}(P)NP^{-s})^{-1}$.
We have shown

$$L(C, s) = \prod_{\substack{a, b = 1 \\ a + b \ne l}}^{l-1} L(s, \lambda_{a,b}).$$

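At this point it is certainly reasonable to hope that $\lambda_{a,b}$ extends to an
algebraic Hecke character. This is indeed the case! $\lambda_{a,b}$ is an algebraic Hecke
character of weight 1 for the modulus $(l^2)$. The corresponding group ring
element is

$$\sum_{t=1}^{l-1} \left( \left[\frac{(a + b)t}{l}\right] - \left[\frac{at}{l}\right] - \left[\frac{bt}{l}\right] \right) \sigma_t^{-1}.$$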

The proof of these facts will be outlined in the Exercises. Here we simply
remark that since $L(C, s)$ is a product of Hecke L-functions, the fundamental
result of Hecke, Theorem 6, shows that $L(C, s)$ can be analytically continued
to an entire function on all of $\mathbb{C}$ and, moreover, satisfies a functional equation
connecting $L(C, s)$ with $L(C, 2 - s)$.


Notes 

The notion of local and global zeta functions attached to an algebraic
curve defined over an algebraic number field goes back to Hasse. In the
late 1930's, Hasse proposed to one of his students the problem of showing
that the global zeta function can be analytically continued to all of $\mathbb{C}$ and
satisfies a functional equation. Weil was asked by G. DeRham for his opinion
of this problem. At the time Weil could see no reason why the global zeta
function should have the properties ascribed to it by Hasse. Moreover,
he thought the problem too difficult for a beginner (“... trop difficile pour un
débutant ...”). For this and other enlightening comments see Weil's Complete
Works [241], Vol. II, pp. 529-530.

In spite of his initial pessimism Weil later gained confidence in Hasse's
conjecture through working out special cases, initially $y^2 = x^4 + 1$ (this is
equivalent to the curve $y^2 = x^3 - 4x$). His work along these lines
culminated in his famous paper “Jacobi Sums as Grössencharaktere” [81].
In this paper Weil treats curves of the form $y^e = \gamma x^f + \delta$ where $2 \le e \le f$
and $\gamma\delta \ne 0$. At the end of the paper he notes the cases $e = 2$ and $f = 3$ or 4
correspond to elliptic curves with complex multiplication. These are, in
essence, the curves we have treated in this chapter. He goes on to say “it
would be of considerable interest to investigate more general elliptic curves
with complex multiplication from the same point of view.” This suggestion
was taken up by M. Deuring with complete success.

In passing it is worth noting that what we have called Hecke characters were
called by Hecke “Grössencharaktere.” In the older literature algebraic Hecke
characters are referred to as characters of type $A_0$.

In his 1954 paper “Abstract versus Classical Algebraic Geometry” 
[241] (Vol. II, pp. 550-558) Weil defines local and global zeta functions for 
a nonsingular algebraic variety defined over an algebraic number field. 
He raises the question of whether these functions can be analytically
continued to all of $\mathbb{C}$ and satisfy a functional equation of an appropriate type.
Having verified that these properties hold in many examples, he writes, 
“It is tempting to surmise that this is always so, but I have little hope that a 
general proof may soon be found.” This conjecture is now known as the 
Hasse-Weil conjecture. Although there has been much progress due to 
Weil himself, Taniyama, Shimura, and others, the Hasse-Weil conjecture 
remains very much an open problem. 

For a comprehensive survey of the various zeta and L-functions that have 
been defined and studied since the nineteenth century see the article on zeta 
functions in the Encyclopedic Dictionary of Mathematics, Vol. II, Section 436 
(M.I.T. Press, 1977). 

Exercises 

1. Let $p$ be prime, $p \equiv 2\ (3)$, and consider the curve $E_p$ defined over $F_p$ by $y^2 = x^3 + a$,
$a \in F_p$. Show that $N(y^2 = x^3 + a) = p + 1$ (projective points).

2. Let $K \subset \mathbb{C}$ be a CM field which is Galois and let $j$ be the restriction to $K$ of complex
conjugation. Show that the fixed field of $j$ is the unique totally real subfield of $K$ of
degree $\frac{1}{2}[K : \mathbb{Q}]$ and that $j\sigma = \sigma j$ for all $\sigma$ in the Galois group of $K$ over $\mathbb{Q}$.

3. Let $A, B \in \mathbb{Z}$, $\Delta = 16(4A^3 - 27B^2) \ne 0$, and $E$ be the elliptic curve defined by
$y^2 = x^3 - Ax - B$. If $p$ is prime, $p \nmid \Delta$, let $N_p$ denote the number of projective points
on the reduced curve $E_p$ over $F_p$. The prime $p$ is said to be anomalous for $E$ if
$\sum_{x=1}^{p} ((x^3 - Ax - B)/p) \equiv -1\ (p)$. Put $f_p = -\sum_{x=1}^{p} ((x^3 - Ax - B)/p)$. Show

(a) $p$ is anomalous for $E$ iff $p \mid N_p$.

(b) Assume the Riemann hypothesis for $E_p$ (see Chapter 11, Section 3). If $p > 5$ then
$f_p = 1 \Leftrightarrow p$ is anomalous for $E$.

(c) Let $B = 0$, $p \equiv 1\ (4)$. Then $f_p$ is even. If $p > 5$, $p$ is not anomalous.

(d) If $B = 0$ then 5 is anomalous $\Leftrightarrow A \equiv 2\ (5)$.

This exercise is taken from Olson [202].

4. Consider the underlying abelian group of rational points on the elliptic curve $E$,
defined by $y^2 = x^3 + c$. If $p \nmid 6c$ then it is known that the torsion subgroup (i.e., the
points of finite order) of $E$ is isomorphic to a subgroup of the torsion subgroup of the
reduced curve modulo $p$. Use Exercise 1 and Dirichlet's theorem on the density of
primes in an arithmetic progression to show that the torsion subgroup of the above
curve can have only 1, 2, 3, 4 or 6 elements. This exercise is taken from Olson [201].

In the following exercises the notation is as in Chapter 14, Section 3. Furthermore $G =
\mathbb{Z}/m\mathbb{Z} \oplus \mathbb{Z}/m\mathbb{Z}$, and $T$ denotes the subset of $G$ consisting of $(a, b)$ with $a \ne 0$, $b \ne 0$,
$a + b \ne 0$.


5. Generalize Exercise 13, Chapter 6, as follows. If $x = (x_1, x_2)$, $y = (y_1, y_2) \in G$ define
$\langle x, y \rangle = x_1y_1 + x_2y_2$. For a $\mathbb{C}$ valued map $f$ defined on $G$ define
$\hat{f}(x) = (1/m^2) \sum_y f(y)\zeta^{-\langle x, y \rangle}$, $y \in G$. Show

(a) $f(x) = \sum_y \hat{f}(y)\zeta^{\langle x, y \rangle}$.

(b) $\sum_x |\hat{f}(x)|^2 = (1/m^2) \sum_x |f(x)|^2$.

(c) Assume $f$ maps $G$ to the unit circle and $\hat{f}$ is integer valued. Show that $f(x, y) =
f(0, 0)\zeta^{ax + by}$ for a suitable $(a, b)$. Conclude that if $f(0, 0) = f(1, 0) = f(0, 1) = 1$
then $f$ is identically 1.

6. For $(a, b) \in G$ define, for $P \subset D_m$, $P$ a prime ideal, $m \notin P$, $\lambda_{a,b}(P)$ as follows:

(i) If $(a, b) \in T$, $\lambda_{a,b}(P) = -J(\chi_P^a, \chi_P^b)$.

(ii) If $(a, b) \ne (0, 0)$, $a + b = 0$ put $\lambda_{a,b}(P) = +\chi_P^a(-1)$.

(iii) If $a + b \ne 0$ and $a$ or $b$ is 0 put $\lambda_{a,b}(P) = 1$.

(iv) $\lambda_{0,0}(P) = -(N(P) - 2)$.

Show that if one modifies the convention in Chapter 8 concerning the trivial
character by putting $\varepsilon(0) = 0$ then $\lambda_{a,b}(P) = -J(\chi_P^a, \chi_P^b)$ for all $(a, b) \in G$.

7. For $(c, d) \in G$ define $N_{c,d}$ as the number of solutions $(x, y)$, $x, y \in F_q$ ($q = N(P)$), to the
equations $x + y = 1$, $\chi_P(y) = \zeta_m^d$ and $\chi_P(x) = \zeta_m^c$. Show that $J(\chi_P^a, \chi_P^b) = \sum_{c,d}
N_{c,d}\zeta^{ac + bd}$. Conclude that $-N_{c,d} = \hat{\lambda}_{c,d}(P)$.

8. Extend $\lambda_{a,b}(P)$ to all ideals $\mathfrak{A} \subset D_m$ (prime to $m$) by multiplicativity. Show

(a) A 0 , 0 (3i ) 稱 e 1 (m 2 ). 

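(b) If $\alpha \in D_m$, $\alpha \ne 0$, $(a, b) \in T$ then $\lambda_{a,b}((\alpha)) = u(a, b)\alpha^{\gamma(a,b)}$, where $u(a, b) \in D_m$,
$|u(a, b)| = 1$ and

$$\gamma(a, b) = \sum_{(t,m)=1} \left( \left[\frac{(a + b)t}{m}\right] - \left[\frac{at}{m}\right] - \left[\frac{bt}{m}\right] \right) \sigma_t^{-1}.$$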


(c) 

9. Assume $\alpha \equiv 1\ (m^2)$. Define $u(a, b)$, for fixed $\alpha$, by Exercise 8 if $(a, b) \in T$. If $(a, b) \notin T$,
$(a, b) \ne (0, 0)$ put $u(a, b) = \lambda_{a,b}((\alpha))$, and $u(0, 0) = 1$. Show

(a) $u(a, b) \equiv \lambda_{a,b}((\alpha))\ (m^2)$ for all $(a, b) \in G$.

(b) $\hat{u}(a, b) \in D_m$, all $(a, b) \in G$.

(c) $\hat{u}(a, b) \in \mathbb{Z}$, all $(a, b) \in G$.

(d) Apply (c) of Exercise 5 to show that $u(a, b) = 1$ for all $(a, b) \in G$, and conclude
that $\lambda_{a,b}$ is an algebraic Hecke character for $D_m$ with a defining modulus $m^2$.


Exercises 5-9 are adapted from Lang [171], Chapter 1, Section 4. 

10. Give an example of a nonabelian CM field. 



Chapter 19 

The Mordell-Weil Theorem


In this chapter we prove the celebrated theorem of Mordell-Weil for elliptic
curves defined over the field of rational numbers. Our treatment is elementary
in the sense that no sophisticated results from algebraic geometry are assumed.
It is our desire to present a self-contained treatment of this important result.
The significance and implications of this theorem for contemporary research in
diophantine geometry are far-reaching. In the following chapter a summary
without proofs of these developments to the present time is sketched. We hope
that these two chapters will inspire the interested student to continue this
study by consulting the more comprehensive texts on the arithmetic of elliptic
curves listed in the bibliography to this chapter.

Our proof of the Mordell-Weil theorem is based on Weil's 1929 paper [W4]
“Sur un théorème de Mordell” and an interesting simplification of the “weak
Mordell-Weil” theorem appearing in J.W.S. Cassels's paper entitled “The
Mordell-Weil Group of Curves of Genus 2” [Ca2].


§1 The Addition Law and Several Identities 

Let $k$ be an arbitrary field of characteristic zero with a fixed algebraic
closure $\bar{k}$. Consider an elliptic curve $E$ defined over $k$ with an affine
equation in Weierstrass form

$$y^2 = x^3 + ax + b = f(x). \tag{1}$$

Here $a$ and $b$ are constants in $k$ subject only to the condition that the curve
$E$ is nonsingular. It is a simple exercise to see that this is equivalent to the
condition that $f(x)$ have three distinct roots in $\bar{k}$. Denote these roots by $\theta_1$,
$\theta_2$, $\theta_3$ so that we have

$$E: \quad y^2 = f(x) = (x - \theta_1)(x - \theta_2)(x - \theta_3). \tag{2}$$

For completeness we include the proof of the following well-known result
from classical algebra.

Lemma 1. $[(\theta_1 - \theta_2)(\theta_2 - \theta_3)(\theta_1 - \theta_3)]^2 = -(4a^3 + 27b^2)$.

Proof. By substituting $x = \theta_i$ in the formal derivative of $f(x)$ one obtains
$3\theta_i^2 + a = (\theta_i - \theta_j)(\theta_i - \theta_k)$, $i, j, k$ distinct. Multiplication now shows that
the negative of the left-hand side of the statement in the lemma is

$$27(\theta_1\theta_2\theta_3)^2 + 9a(\theta_1^2\theta_2^2 + \theta_1^2\theta_3^2 + \theta_2^2\theta_3^2) + 3a^2(\theta_1^2 + \theta_2^2 + \theta_3^2) + a^3.$$

But $\theta_1 + \theta_2 + \theta_3 = 0$, $\theta_1\theta_2 + \theta_1\theta_3 + \theta_2\theta_3 = a$, $\theta_1\theta_2\theta_3 = -b$, and several
applications of the identity $(x + y + z)^2 = x^2 + y^2 + z^2 + 2(xy + xz + yz)$
complete the proof. □
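
Lemma 1 is easy to test on examples. The short Python sketch below picks two roots, sets the third so that the roots sum to zero, recovers $a$ and $b$ from the symmetric functions used in the proof, and returns both sides of the identity. The function name `check_discriminant` is ours and serves only as an illustration.

```python
def check_discriminant(t1, t2):
    """Return both sides of Lemma 1 for the cubic with roots t1, t2, -t1-t2."""
    t3 = -t1 - t2                         # the roots of x^3 + a x + b sum to zero
    a = t1 * t2 + t1 * t3 + t2 * t3       # second elementary symmetric function
    b = -t1 * t2 * t3                     # minus the product of the roots
    lhs = ((t1 - t2) * (t2 - t3) * (t1 - t3)) ** 2
    rhs = -(4 * a ** 3 + 27 * b ** 2)
    return lhs, rhs


# roots 1, 2, -3 give f(x) = x^3 - 7x + 6
print(check_discriminant(1, 2))
```

With roots $1, 2, -3$ one has $a = -7$, $b = 6$, and both sides equal 400.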


Recall from Chapter 18 that the nonzero quantity $-16(4a^3 + 27b^2)$ is
called the discriminant $\Delta$ of the curve $E$. We will see in Chapter 20 that the
prime divisors of $\Delta$ enter into the precise formulation of the conjectures of
Birch and Swinnerton-Dyer.

We view $E$ as a projective curve whose points are the affine points
$(x, y)$ satisfying (1) along with a single point on the line at infinity denoted
by $\infty$. As mentioned in Chapter 18 the “chord and tangent” process
defines a group structure on $E$. We now make this definition precise and
derive several identities that are needed in what follows.

The identity element is taken to be the point at infinity $\infty$. The group
structure is defined by the requirement that three points $P$, $Q$, $R$ on $E$ are
collinear if and only if $P + Q + R = \infty$. If $P = (\alpha, \beta)$ is a point of $E$, then
$Q = (\alpha, -\beta)$ is also on $E$ and $P$, $Q$, $\infty$ are collinear. Thus $P + Q = \infty$ and
we see that $-P = (\alpha, -\beta)$. The points of order 2 are therefore $(\theta_i, 0)$, $i =
1, 2, 3$. It is important to realize that these points need not be rational over
$k$. Now let $P = (x_1, y_1)$, $Q = (x_2, y_2)$ be two affine points on $E$ with $x_1 \ne x_2$.
Intersecting $E$ with the line through $P$ and $Q$ shows that the polynomial
in $x$

$$x^3 + ax + b - \left[ y_1 + \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(x - x_1) \right]^2$$

has roots $x_1$ and $x_2$. Hence, if $x_3$ is the third root,

$$x_1 + x_2 + x_3 = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2$$

so that

$$x_3 = -x_1 - x_2 + \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2. \tag{3}$$

If $(x_3, y_3)$ is the third point of intersection of the line between $P$ and $Q$ with
$E$ then

$$y_3 = y_1 + \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(x_3 - x_1),$$

and it follows that if $P$ and $Q$ are rational over $k$, then so is $(x_3, y_3)$. Now
by definition of the group law one has

$$P + Q + (x_3, y_3) = \infty$$

or

$$P + Q = -(x_3, y_3) = (x_3, -y_3). \tag{4}$$

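Finally, if $P = (x_1, y_1)$, $y_1 \ne 0$, we must derive a formula for $2P$. The
tangent line to $E$ at $(x_1, y_1)$ has equation

$$y = y_1 + \left(\frac{3x_1^2 + a}{2y_1}\right)(x - x_1).$$

In other words, one calculates easily that the polynomial

$$f(x) - \left[ y_1 + \left(\frac{3x_1^2 + a}{2y_1}\right)(x - x_1) \right]^2$$

has $x_1$ as a double root. Again, as the coefficient of $x^2$ in $f(x)$ is 0, one has

$$2x_1 + x_3 = \left(\frac{3x_1^2 + a}{2y_1}\right)^2,$$

where $x_3$ is the third root. Hence,

$$x_3 = -2x_1 + \left(\frac{3x_1^2 + a}{2y_1}\right)^2.$$

If $(x_3, y_3)$ is the third point of intersection with $E$, one has

$$y_3 = y_1 + \left(\frac{3x_1^2 + a}{2y_1}\right)(x_3 - x_1).$$

Thus,

$$2P + (x_3, y_3) = \infty,$$

and we see that

$$2P = -(x_3, y_3) = (x_3, -y_3). \tag{5}$$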
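
Formulas (3), (4), and (5) translate directly into a few lines of code. The following Python sketch implements the addition law over $\mathbb{Q}$ with exact rational arithmetic; the names `INF`, `add`, and `on_curve`, and the sample curve $y^2 = x^3 + 17$, are our own choices for illustration and do not appear in the text.

```python
from fractions import Fraction as F

INF = None  # our representation of the point at infinity


def on_curve(P, a, b):
    """True if P is the point at infinity or satisfies y^2 = x^3 + a x + b."""
    return P is INF or P[1] ** 2 == P[0] ** 3 + a * P[0] + b


def add(P, Q, a):
    """Sum of P and Q on y^2 = x^3 + a x + b, following (3), (4), and (5)."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 != y2 or y1 == 0:
            return INF                      # Q = -P, so the sum is infinity
        m = (3 * x1 * x1 + a) / (2 * y1)    # slope of the tangent line
    else:
        m = (y2 - y1) / (x2 - x1)           # slope of the chord
    x3 = m * m - x1 - x2                    # third root, as in (3)
    y3 = y1 + m * (x3 - x1)                 # third point of intersection
    return (x3, -y3)                        # negate, as in (4) and (5)


a, b = F(0), F(17)
P, Q = (F(-2), F(3)), (F(-1), F(4))
print(add(P, Q, a))   # chord through P and Q
print(add(P, P, a))   # tangent at P
```

On this curve the chord through $(-2, 3)$ and $(-1, 4)$ meets $E$ again at $(4, 9)$, so the sum is $(4, -9)$, and the tangent at $(-2, 3)$ meets $E$ again at $(8, 23)$, so $2P = (8, -23)$.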

As mentioned in Chapter 18, the proof of the associative law is not
obvious. There we referred the reader to Fulton [135] for a geometric
approach. Since the first edition of this text was published, several new
texts have appeared. We recommend, in particular, J.H. Silverman [Si]
and D. Husemoller [Hus] for a thorough treatment of this matter. We do
mention, however, that if one uses the parameterization of $E$ by the
Weierstrass $\wp$-function and its derivative (at least when $k \subset \mathbb{C}$), then one
sees that the group law is precisely the “addition formula” and “duplication
formula” from the classical theory of elliptic functions. Thus, the
group law on $E$ is the “transport” to $E$ of the natural additive structure on
the complex torus whose lattice defines the Weierstrass functions. Thus,
associativity is “obvious.” It is, of course, a nontrivial fact that every
elliptic curve arises in such a fashion. With our purely algebraic definition
of the group law, the proof of associativity becomes a straightforward, if
somewhat tedious, exercise in algebra.

We see that the set of points on $E$ that are rational over $k$ form a group,
denoted by $E(k)$. This group is clearly abelian, as follows from the geometric
definition of the addition law and is visible again in (3). We may now
state the main theorem of this chapter. Let $k = \mathbb{Q}$, the field of rational
numbers.


Theorem. $E(\mathbb{Q})$ is a finitely generated group.


The addition formulas were obtained by using the fact that the sum of the
roots of a polynomial is the negative of the trace term. Beginning with (2)
and using the corresponding observation for the product of the roots we
obtain relations that will be needed later. Replace $x$ by $t + \theta$ in (2), where
$\theta$ is one of the roots $\theta_1$, $\theta_2$, $\theta_3$. Let $\theta'$, $\theta''$ be the other two roots.

If $P = (x_1, y_1)$, $Q = (x_2, y_2)$ are points on $E$, with $x_1 \ne x_2$, then, as
before, the polynomial

$$\left[ y_1 + \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(t + \theta - x_1) \right]^2 - t(t + \theta - \theta')(t + \theta - \theta'')$$

has roots $x_1 - \theta$, $x_2 - \theta$, $x_3 - \theta$. Thus,

$$(x_1 - \theta)(x_2 - \theta)(x_3 - \theta) = \left[ y_1 + (\theta - x_1)\left(\frac{y_2 - y_1}{x_2 - x_1}\right) \right]^2 = \left[ \frac{y_1(x_2 - \theta) - y_2(x_1 - \theta)}{x_2 - x_1} \right]^2. \tag{6}$$

Similarly, if $P = (x_1, y_1)$, $y_1 \ne 0$ is on $E$, then

$$\left[ y_1 + \left(\frac{f'(x_1)}{2y_1}\right)(x - x_1) \right]^2 - (x - \theta_1)(x - \theta_2)(x - \theta_3)$$

has $x_1$ as a double root. Let, as usual, $x_3$ be the third root. By putting
$x - \theta = t$ one sees immediately that

$$(x_1 - \theta)^2(x_3 - \theta) = \left[ y_1 + \frac{f'(x_1)(\theta - x_1)}{2y_1} \right]^2$$

or

$$x_3 - \theta = \frac{1}{(x_1 - \theta)^2} \left[ \frac{2y_1^2 + (3x_1^2 + a)(\theta - x_1)}{2y_1} \right]^2. \tag{7}$$

Now

$$y_1^2 = x_1^3 + ax_1 + b - \theta^3 - a\theta - b = (x_1 - \theta)(x_1^2 + x_1\theta + \theta^2 + a).$$

Substituting into (7) gives the relation

$$x_3 - \theta = \left[ \frac{-x_1^2 + 2\theta x_1 + 2\theta^2 + a}{2y_1} \right]^2. \tag{8}$$

We require one final relation. From (3) one sees that

$$x_3 = \frac{-(x_1 + x_2)(x_1 - x_2)^2 + y_2^2 - 2y_1y_2 + y_1^2}{(x_2 - x_1)^2}.$$

Using $y_1^2 = x_1^3 + ax_1 + b$ and $y_2^2 = x_2^3 + ax_2 + b$ we obtain, after a simple
calculation,

$$x_3 = \frac{(x_1x_2 + a)(x_1 + x_2) + 2b - 2y_1y_2}{(x_2 - x_1)^2}. \tag{9}$$

In formula (9) we are assuming, of course, that $x_1 \ne x_2$.

This completes the list of identities that will be needed in the proof of 
the Mordell-Weil theorem. 
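
Identities (6), (8), and (9) can be checked with exact arithmetic on a curve whose roots are rational. The Python sketch below does so on $y^2 = x^3 - 25x$, with roots $0, 5, -5$, using the points $(-4, 6)$ and $(45, 300)$. The curve, the points, and all function names are assumptions made for this illustration, not part of the text.

```python
from fractions import Fraction as F

A, B = F(-25), F(0)                 # the curve y^2 = x^3 - 25x
THETAS = (F(0), F(5), F(-5))        # its three roots theta


def third_x_chord(P, Q):
    """x3 from formula (3): third root on the chord through P and Q."""
    (x1, y1), (x2, y2) = P, Q
    m = (y2 - y1) / (x2 - x1)
    return m * m - x1 - x2


def third_x_tangent(P):
    """x3 for the tangent line at P, as in the derivation of (5)."""
    x1, y1 = P
    m = (3 * x1 * x1 + A) / (2 * y1)
    return m * m - 2 * x1


def check_identities(P, Q):
    """Return the truth values of (6), (8), and (9) for the given points."""
    (x1, y1), (x2, y2) = P, Q
    x3 = third_x_chord(P, Q)
    ok6 = all((x1 - t) * (x2 - t) * (x3 - t)
              == ((y1 * (x2 - t) - y2 * (x1 - t)) / (x2 - x1)) ** 2
              for t in THETAS)
    xd = third_x_tangent(P)
    ok8 = all(xd - t == ((-x1 * x1 + 2 * t * x1 + 2 * t * t + A) / (2 * y1)) ** 2
              for t in THETAS)
    ok9 = x3 == ((x1 * x2 + A) * (x1 + x2) + 2 * B - 2 * y1 * y2) / (x2 - x1) ** 2
    return ok6, ok8, ok9


print(check_identities((F(-4), F(6)), (F(45), F(300))))
```

For these two points the chord has slope 6 and meets the curve again at $(-5, 0)$, a point of order 2, which makes the products in (6) easy to verify by hand as well.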


§2 The Group E/2E 

In this section $k$ remains an arbitrary field of characteristic zero. Using
the notation of §1 consider the residue class ring $k[x]/(f(x)) = k[\xi]$, $\xi$
being the class of $x$ mod $f(x)$. This ring is a $k$-algebra of dimension 3 over
$k$. If $f(x) = f_1(x)f_2(x) \cdots f_n(x)$ is the decomposition of $f(x)$ as a product of
distinct irreducibles, then

$$k[x]/(f(x)) = \bigoplus_{i=1}^{n} k[x]/(f_i(x)) \tag{10}$$

by the polynomial version of the Chinese remainder theorem. Here $n$ may
take the values 1, 2, or 3 according as $f(x)$ is, respectively, irreducible, the
product of a linear and an irreducible quadratic, or the product of three
distinct linear factors. If one of the factors is linear, say $f_1(x) = x - \alpha$,
then, of course, $k[x]/(f_1(x))$ is naturally isomorphic to $k$ by the map which
sends the class of $g(x)$ to $g(\alpha)$. The linear factors $x - \alpha$ correspond to the
$k$-rational points $(\alpha, 0)$ of order 2 on the elliptic curve $E$ defined by $y^2 =
f(x)$. Denote the group of units of the ring $k[\xi]$ by $U$. These elements are
the residues modulo $f(x)$ of polynomials $h(x)$ that are coprime to $f(x)$.
Furthermore, this group of units is isomorphic to the direct product of the
unit groups of the factors in the preceding decomposition.

The purpose of this section is to construct a homomorphism $\phi$ from $E$
to the group $U/U^2$ with kernel precisely $2E$. In the context of the Weierstrass
parameterization of $E$ by elliptic functions this map was defined by
Weil in his 1929 paper [W4]. We follow an algebraic adaptation and
simplification of Weil's construction due to Cassels [Ca2].

The mapping $\phi$ is defined as follows: First, $\phi(\infty) = 1$ where 1 denotes
the identity of the group $U/U^2$. Next, if $P = (\alpha, \beta)$ is a point on $E$ distinct
from the points of order 2, i.e., $\beta \ne 0$, then since $\alpha - x$ is prime to $f(x)$,
$\alpha - \xi$ is in $U$ and $\phi(P)$ is defined to be the image of $\alpha - \xi$ in $U/U^2$. It
remains to define $\phi(P)$ when $P = (\alpha, 0)$ is a point of order 2 on $E$. Write
$f(x) = (x - \alpha)g(x)$. Then

$$k[\xi] = k[x]/(x - \alpha) \oplus k[x]/(g(x)) \tag{11}$$

where $g(x)$ is a polynomial, not necessarily irreducible, of degree 2.
Identifying the first factor in the preceding decomposition with $k$, as
mentioned earlier, we see that the element $(f'(\alpha), (\alpha - x) \bmod g(x))$ is a unit
corresponding in $k[\xi]$ to a unique element, say $h(\xi)$ mod $U^2$, in $U/U^2$, and
$\phi(P)$ is defined to be this element. The
reason for the choice $f'(\alpha)$ in the component where $\alpha - \xi$ ceases to be a
unit is made partially clear by the proof that the map $\phi$ so defined is indeed a
homomorphism. For an explanation using a little more algebraic geometry
see Cassels's original paper ([Ca2], §1.3). See also §2 of Brumer and
Kramer [Br-Kr].

Lemma 2. $\phi$ is a homomorphism.

Proof. If $P = (\alpha, \beta)$, since the definition of $\phi$ is independent of $\beta$ and
$-P = (\alpha, -\beta)$, then $\phi(P) = \phi(-P)$. Now if $\rho$ is in $U/U^2$, then $\rho^2 = 1$,
so that $\phi(P + Q) = \phi(P)\phi(Q)$ is equivalent to $\phi(P + Q)\phi(P)\phi(Q) =
\phi(P + Q)\phi(-P)\phi(-Q) = 1$. Thus, to establish that $\phi$ is a homomorphism
we must show that if $A + B + C = \infty$ on $E$, then

$$\phi(A)\phi(B)\phi(C) = 1 \tag{12}$$

in $U/U^2$. The condition $A + B + C = \infty$ is, by §1, simply the condition that
$A$, $B$, $C$ are collinear. Put $A = (x_1, y_1)$, $B = (x_2, y_2)$, $C = (x_3, y_3)$, and
assume that $A$, $B$, $C$ are distinct points. If $x_1 = x_2$, then the points are $A$,
$-A$ and infinity. The result follows noting that $\phi(-A) = \phi(A)$. Let $x_1 \ne x_2$
and assume none of the points has order 2. The collinearity of $A$, $B$, $C$
simply amounts to the existence of a linear form $cx + d$ such that

$$f(x) - (cx + d)^2 = (x - x_1)(x - x_2)(x - x_3) \tag{13}$$

and the result follows by reduction mod $f(x)$. Next suppose that precisely
one of the points, say, $A = (\alpha, 0)$, has order 2. We check (12) in each of
the two summands of (11). The result holds in the second factor by
reducing (13) mod $g(x)$. Furthermore, differentiating (13) and putting $x = \alpha$
shows $f'(\alpha) = (\alpha - x_2)(\alpha - x_3)$ so that, by definition, the first component
of (12) is $(f'(\alpha))^2$. The final case to check is $A = (\theta_1, 0)$, $B = (\theta_2, 0)$, $C =
(\theta_3, 0)$, but again differentiating (13) and putting $x = \theta_i$ one sees that the
three components of (12) in the decomposition of $k[\xi]$ as the direct sum of
three copies of $k$ corresponding to the roots of $f(x)$ are the squares
$f'(\theta_1)^2$, $f'(\theta_2)^2$, $f'(\theta_3)^2$. □


We mention that one can also use the explicit formulas (6) and (7) of §1
applied to the various factors in the decomposition (10). Once again we
refer to Cassels [Ca2] for a proof of this statement that avoids the
examination of special cases. The last result of this section is the proof that the
kernel of $\phi$ is $2E$.

Lemma 3. $\ker \phi = 2E$.

Proof. Since $\phi(2P) = \phi(P)^2 = 1$ we see that $2E \subset \ker \phi$. Thus, consider
a point $P$, which we may assume different from $\infty$, such that $\phi(P) = 1$.
Write $P = (\alpha, \beta)$, $\alpha, \beta \in k$. Then $\alpha - \xi$ is a square in $k[\xi]$. Note that this
holds even when $2P = \infty$, for then $\alpha - \xi$ is 0 in one of the components of
(10). Thus, we may write

$$\alpha - \xi = (a_1\xi^2 + a_2\xi + a_3)^2, \tag{14}$$

where $a_1, a_2, a_3 \in k$. It is easy to see, using $\xi^3 = -a\xi - b$, that one can
write

$$e_1\xi + f_1 = (a_1\xi^2 + a_2\xi + a_3)(-a_1\xi + a_2), \tag{15}$$

where $e_1, f_1 \in k$. Now $a_1 \ne 0$, for otherwise, by linear independence of 1,
$\xi$, $\xi^2$, (14) would give a contradiction. Thus, squaring (15), substituting
(14), and dividing by $a_1^2$ gives the relation

$$(e\xi + e')^2 = (\alpha - \xi)(h - \xi)^2 \tag{16}$$

for $\alpha, e, e', h \in k$. This implies that $(ex + e')^2 - (\alpha - x)(h - x)^2$ is a
multiple of $f(x)$, and, since the latter polynomial is a monic cubic, we see
that

$$f(x) = (ex + e')^2 - (\alpha - x)(h - x)^2. \tag{17}$$

But geometrically this says that the line $y = ex + e'$ intersects $E$ at $(\alpha, \beta)$
or $(\alpha, -\beta)$ and $(h, t)$ for suitable $t$ with $(h, t)$ counted twice. Thus, by
definition of the group structure on $E$ we have

$$(\alpha, \pm\beta) + 2(h, t) = \infty$$

for a suitable choice of the sign of $\beta$. This implies that

$$P = (\alpha, \beta) = 2Q$$

for $Q = (h, \pm t)$, again adjusting the sign of $t$. We have thus shown $\ker \phi
\subset 2E$. □

§3 The Weak Dirichlet Unit Theorem 

If the ground field $k$ is an algebraic number field, then the existence of an
injection of $E/2E$ into $U/U^2$ can be used to show that $E/2E$ is a finite
group. This result is often referred to as the Weak Mordell-Weil Theorem.
To derive this result one needs, in addition to the finiteness of the
class number of an algebraic number field, the Dirichlet unit theorem. The
fact that the group of ideal classes is finite is proved in Theorem 1 of
Chapter 12. The structure of the group of units in the ring of integers of an
algebraic number field is stated without proof on page 192, Chapter 13.
However, the full statement of the unit theorem is unnecessary if one is
interested only in the finite generation of $E(\mathbb{Q})$. What is needed is only the
fact that the group of units of the ring of integers of an algebraic number
field is finitely generated, and this follows without difficulty, via the
standard “logarithmic embedding,” from the fact that a discrete subgroup of
the additive group $\mathbb{R}^n$ is a lattice. In view of our desire to keep the proof of
our main result self-contained, we include a proof of this weaker form of
the Dirichlet unit theorem. Those who are willing to accept this fact can
proceed directly to the following section where the proof of Mordell-Weil
is concluded.

Let $K$ be an algebraic number field of degree $n$. We consider as in
Chapter 12 the $n$ distinct isomorphisms from $K$ to $\mathbb{C}$, but we order them in
the following way: Let $\sigma_1, \ldots, \sigma_s$ be the isomorphisms such that
$\sigma_i(K) \subset \mathbb{R}$. The remaining isomorphisms occur in distinct conjugate pairs.
There are $t$ such pairs, and we choose one from each pair, denoting these
elements by $\sigma_{s+1}, \ldots, \sigma_{s+t}$. The set of all $n$ isomorphisms is then
$\{\sigma_1, \ldots, \sigma_s, \sigma_{s+1}, \bar\sigma_{s+1}, \ldots, \sigma_{s+t}, \bar\sigma_{s+t}\}$, which we also list as
$\{\tau_1, \ldots, \tau_n\}$ when a uniform notation is convenient. Let $V = \mathbb{R}^s \times \mathbb{C}^t$,
and define a mapping $\phi$ from $K$ to $V$ by $\phi(\alpha) = (\sigma_1(\alpha), \ldots, \sigma_{s+t}(\alpha))$.
Fix an integral basis $\alpha_1, \ldots, \alpha_n$ of $K/\mathbb{Q}$. Then by Proposition 12.1.3,
$(\det(\tau_j(\alpha_i)))^2$ is not zero, being the discriminant of $K$. It is then a simple
exercise to show that the vectors $\phi(\alpha_1), \ldots, \phi(\alpha_n)$ are $\mathbb{R}$-linearly
independent in $V$. Now a lattice in $V$ is, by definition, an additive subgroup of
$V$, which may be written in the form $\mathbb{Z}v_1 + \cdots + \mathbb{Z}v_l$, where $v_1, \ldots, v_l$
are $\mathbb{R}$-linearly independent elements of $V$. If $D$ is the ring of integers in $K$,
we have shown the following:

Lemma 1. $\phi(D)$ is a lattice.

By a discrete subset of $\mathbb{R}^n$ is meant any subset $A$ for which $A \cap T$ is
finite whenever $T$ is compact. It is, of course, sufficient to take $T$ a closed
ball of finite radius. It is a simple exercise to show that a lattice is discrete.
We need the following converse. Let $W$ be any finite dimensional vector
space over $\mathbb{R}$.

Lemma 2. A discrete additive subgroup $A$ of $W$ is a lattice.

Proof. Let $v_1, \ldots, v_m$ be a maximal set of $\mathbb{R}$-independent elements in
$A$. Then any element $a$ of $A$ may be written in the form $a = r_1v_1 + \cdots +
r_mv_m$ where $r_i \in \mathbb{R}$. Now $A$ contains the lattice $\Gamma = \mathbb{Z}v_1 + \cdots + \mathbb{Z}v_m$. If
$T = \{c_1v_1 + \cdots + c_mv_m \mid 0 \le c_i \le 1\}$, then $T$ is compact and clearly any
element $a \in A$ can be written as $\gamma + t$ for $\gamma \in \Gamma$ and $t \in T \cap A$. But
$T \cap A = \{a_1, \ldots, a_r\}$ is a finite set. It follows that $\Gamma$ is a subgroup of
finite index in $A$ and so $dA \subset \Gamma$ for some positive integer $d$. Then $A \subset
(1/d) \cdot \Gamma$ where $(1/d) \cdot \Gamma$ is a free $\mathbb{Z}$-module of finite rank $m$. By a standard
result in algebra $A$ is then a free $\mathbb{Z}$-module of rank $l \le m$, generated by,
say, $w_1, \ldots, w_l$. Now $v_1, \ldots, v_m \in A$ and the $\mathbb{R}$-module generated
by them has dimension $m$ and is contained in the $\mathbb{R}$-module generated by
$w_1, \ldots, w_l$. It follows that $m = l$ and $w_1, \ldots, w_m$ are $\mathbb{R}$-linearly
independent. Thus, we have shown that $A$ is a lattice. □

In order to discuss the structure of the group of units of $D$ in the
context of lattices, we define a map $\lambda$ from the open subset of $\mathbb{R}^s \times \mathbb{C}^t$
consisting of all points no coordinate of which is zero to $\mathbb{R}^{s+t}$ by
$\lambda(\alpha_1, \ldots, \alpha_{s+t}) = (\ln|\alpha_1|, \ldots, \ln|\alpha_s|, 2\ln|\alpha_{s+1}|, \ldots, 2\ln|\alpha_{s+t}|)$.
Then $\mu = \lambda\phi$ maps $K^*$ to $\mathbb{R}^{s+t}$. It is simple to see that $\lambda^{-1}(T)$ is compact
when $T$ is a compact subset of $\mathbb{R}^{s+t}$. The map $\mu$ is clearly a homomorphism
from the multiplicative group $K^*$ to the additive group $\mathbb{R}^{s+t}$ which, by
lemma 2, §5, Chapter 14, has the group of roots of unity in $K$ as kernel
when restricted to the nonzero elements of $D$.
Denote by $\mathcal{U}$ the group of units of $D$.


Lemma 3. $\mu(\mathcal{U})$ is a lattice.

Proof. If $T$ is a compact set in $\mathbb{R}^{s+t}$, then $S = \lambda^{-1}(T) \cap \phi(\mathcal{U}) \subset \phi(D)$, and
the comment preceding this lemma together with lemma 1 shows that $S$ is
finite. Hence, $T \cap \mu(\mathcal{U})$ is finite, and thus, $\mu(\mathcal{U})$ is discrete. But clearly $\mu$ is
a homomorphism from the multiplicative group $\mathcal{U}$ to the additive group $\mathbb{R}^{s+t}$,
and lemma 2 now shows that $\mu(\mathcal{U})$ is a lattice. □

Lemma 4. $\mathcal{U}$ is finitely generated.

Proof. Choose a lattice basis for $\mu(\mathcal{U})$, say, $v_1, \ldots, v_l$, and let
$u_1, \ldots, u_l$ be units with $\mu(u_i) = v_i$. If $u \in \mathcal{U}$, put
$\mu(u) = c_1 v_1 + \cdots + c_l v_l$, $c_i \in \mathbb{Z}$. If $\zeta = u u_1^{-c_1} u_2^{-c_2} \cdots u_l^{-c_l}$, then clearly
$\mu(\zeta) = 0$. This implies, by the
comment preceding lemma 3, that $\zeta$ is a root of unity. But the set of roots
of unity in K is finite and $u = \zeta u_1^{c_1} u_2^{c_2} \cdots u_l^{c_l}$. □

§4 The Weak Mordell-Weil Theorem


Assume now that the ground field is the field $\mathbb{Q}$ of rational numbers. In §2
we established the existence of a homomorphism $\phi$ from the group $E(\mathbb{Q})$
of $\mathbb{Q}$-rational points on E to the multiplicative group $U/U^2$, where U is the
group of units of the ring $R = \mathbb{Q}[x]/(f(x))$. As in that section, R may be
identified with the direct sum of the fields $\mathbb{Q}(\theta_i) = K_i$, and the image
$\phi(E(\mathbb{Q}))$ may be viewed as a subgroup of the direct product of the groups
$K_i^*/(K_i^*)^2 = G_i$. With these identifications we will show that if P is a point
on $E(\mathbb{Q})$, then the ith component of $\phi(P)$ lies in a finite subgroup of $G_i$.
We may assume that $P \ne \infty$ and write $P = (\alpha/\beta, w)$, where $\alpha$ and $\beta$ are
coprime rational integers. Let $\theta = \theta_i$ be a fixed root of f(x), and write
$f(x) = (x - \theta)g(x)$. Then $\alpha - \beta\theta$ and $h_{\alpha,\beta} = g(\alpha/\beta)\beta^2$ are algebraic integers
in $K = K_i$, and we put $I(P) = (\alpha - \beta\theta, h_{\alpha,\beta})$, the ideal generated by them.
In the remainder of this section, all algebraic integers, ideals, and units
are in the ring of integers of K.

Lemma 1. The set of ideals I(P) is finite.

Proof. $g(x) - g(\theta) = (x - \theta)t(x)$, where t(x) is a linear polynomial
with coefficients in $\mathbb{Z}[\theta]$. Substituting $x = \alpha/\beta$ gives
$g(\theta)\beta^2 = h_{\alpha,\beta} - (\alpha - \beta\theta)\,t(\alpha/\beta)\,\beta$. Hence, $g(\theta)\beta^2 \in I(P)$. Similarly, one calculates

$$g(\theta)x^2 - g(x)\theta^2 = g(\theta)(x^2 - \theta^2) + \theta^2(g(\theta) - g(x)) = (x - \theta)\,l(x),$$

which shows, putting $x = \alpha/\beta$, that

$$g(\theta)\alpha^2 = (\alpha - \beta\theta)\,l(\alpha/\beta)\,\beta + \theta^2 g(\alpha/\beta)\beta^2.$$

Hence, $g(\theta)\alpha^2 \in I(P)$. It follows that I(P) divides the ideal
$(g(\theta)\alpha^2, g(\theta)\beta^2)$. But $\alpha$ and $\beta$ are relatively prime, and we conclude that
I(P) divides the principal ideal $(g(\theta))$. But $g(\theta) \ne 0$, and therefore, $(g(\theta))$
has only a finite number of ideal divisors. □

The denominator $\beta$ of the first coordinate of P is the square of an
integer. This elementary fact is shown, in a homogeneous context, at the
beginning of the next section, but it can be seen, without fear of redundancy,
quickly as follows. If $w = c/d$, $(c, d) = 1$, then
$\beta^3 c^2 = d^2(\alpha^3 + a\alpha\beta^2 + b\beta^3)$. Since $\beta$ is coprime to $\alpha$, it is coprime to the
second factor on the right, so $\beta^3 \mid d^2$; using $(c, d) = 1$ we also have $d^2 \mid \beta^3$,
and we conclude $\beta^3 = d^2$, from which it follows that $\beta$ is a square.
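
As a quick numerical illustration of this fact, the following Python sketch doubles a point repeatedly on a sample curve and checks at each step that the denominator β of the first coordinate is a perfect square and that β^3 = d^2. The curve y^2 = x^3 - 2, the starting point (3, 5), and all function names are assumptions chosen for the example and do not come from the text; the doubling step uses the standard tangent-line formula.

```python
from fractions import Fraction
from math import isqrt

# Sample curve y^2 = x^3 + A*x + B (an illustrative assumption, not from the text).
A, B = 0, -2


def on_curve(P):
    """Return True if the affine point P = (x, y) satisfies the curve equation."""
    x, y = P
    return y * y == x ** 3 + A * x + B


def double(P):
    """Double an affine point with y != 0 using the tangent line at P."""
    x, y = P
    lam = (3 * x * x + A) / (2 * y)      # slope of the tangent at P
    x3 = lam * lam - 2 * x
    return (x3, lam * (x - x3) - y)


def is_square(n):
    """Return True if the non-negative integer n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n


def check_denominators(P, steps):
    """Double P `steps` times; record each point and whether beta is a square
    with beta^3 == d^2, where beta and d are the denominators of x and y."""
    results = []
    for _ in range(steps):
        P = double(P)
        beta, d = P[0].denominator, P[1].denominator
        ok = on_curve(P) and is_square(beta) and beta ** 3 == d ** 2
        results.append((P, ok))
    return results


if __name__ == "__main__":
    for point, ok in check_denominators((Fraction(3), Fraction(5)), 3):
        print(point[0], point[1], ok)
```

The first doubling gives (129/100, -383/1000), so β = 100 = 10^2 and d = 1000, in agreement with β^3 = d^2.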

Lemma 2. $(\alpha - \beta\theta) = I(P)C^2$ for some ideal C.

Proof. Since I(P) is the greatest common divisor of $\alpha - \beta\theta$ and $h_{\alpha,\beta}$, we
may write $(\alpha - \beta\theta) = I(P)A$, $(h_{\alpha,\beta}) = I(P)B$, where A and B are coprime
ideals. But $P \in E(\mathbb{Q})$, so there is a rational number r/s so that $(r/s)^2 = f(\alpha/\beta)$.
Thus, $\beta^3 r^2 = s^2(\alpha - \beta\theta)h_{\alpha,\beta}$. Since $\beta$ is a square, it follows that $(s)^2 I(P)^2 AB$ is a square
and, since A and B are coprime, we conclude that A is the square of an
ideal. □

Recall from Chapter 12 that the group of ideal classes of the ring of
integers in K is finite. Let $C_1, \ldots, C_h$ be representatives for the ideal
classes. Then, by definition, if J is an ideal, there is an index i and algebraic
integers $\mu$, $\nu$ such that $\mu J = \nu C_i$.

Lemma 3. There is a finite set of algebraic integers S such that for any
$P = (\alpha/\beta, w) \in E(\mathbb{Q})$ one can write

$$\alpha - \beta\theta = u\gamma\tau^2$$

for a suitable unit u, an algebraic number $\tau$, and $\gamma \in S$.

Proof. If C is as in the preceding lemma, then C is equivalent to $C_s$ for
some s. Therefore, $I(P)C_s^2$ is equivalent to the principal ideal $I(P)C^2 = (\alpha - \beta\theta)$
and is thus a principal ideal, say, $(\gamma)$. By Lemma 1 the set $\{(\gamma)\}$ is
finite, depending only on $E(\mathbb{Q})$ and not on the particular P. Now there exist
algebraic integers $\rho$, $\eta$ such that $\rho C = \eta C_s$. Hence,
$(\rho^2(\alpha - \beta\theta)) = I(P)\eta^2 C_s^2 = (\gamma\eta^2)$. It follows that $\rho^2(\alpha - \beta\theta) = u\gamma\eta^2$ for some unit u. The
lemma follows by putting $\tau = \eta/\rho$. □

We may now prove the finiteness of E/2E.

Theorem 19.4.1. E/2E is a finite group.

Proof. It is enough to show that $\phi(E)$ is finite. We may assume that $P \ne \infty$
and that P does not have order 2. Then $\phi(P)$ is defined as the coset
modulo $U^2$ of $\alpha/\beta - x$, where $P = (\alpha/\beta, w)$, in the group U. If we consider
the ith component $K = K_i = \mathbb{Q}(\theta_i)$ of $\mathbb{Q}[x]/(f(x))$, the preceding lemma
shows that in $K_i^*/(K_i^*)^2$ the image of $\alpha/\beta - x$ is the coset of $(1/\beta)u\gamma$. As we
have seen, $\beta$ is the square of a rational integer, and by the weak Dirichlet
unit theorem, the group $\mathcal{U}$ of units in the ring of algebraic integers of K is
finitely generated with generators, say, $u_1, u_2, \ldots, u_t$. It follows that the
coset of $(1/\beta)u\gamma$ mod $(K^*)^2$ has a representative of the form
$u_1^{e_1} u_2^{e_2} \cdots u_t^{e_t}\gamma$, where each $e_i$ is 0 or 1. Since $\gamma$ varies over a finite set, the
ith component of the image is finite and the result follows. □


§5 The Descent Argument 

In this section we take the ground field k to be the field of rational numbers
$\mathbb{Q}$. The algebraic closure $\bar{k}$ is then the field of algebraic numbers,
which we assume to be a subfield of the complex numbers $\mathbb{C}$. If $\alpha \in \mathbb{C}$,
denote by $|\alpha|$ its ordinary absolute value. The coefficients a and b of the
elliptic curve E are assumed to be rational integers. We write the equation
defining E in homogeneous form

$$y^2 z = x^3 + a x z^2 + b z^3 \tag{1}$$

and use homogeneous coordinates $(x_0, y_0, z_0)$ for a point P on E. Thus, $x_0$,
$y_0$, $z_0$ are determined up to a nonzero proportionality factor, and since P is
assumed to be rational over $\mathbb{Q}$, we may assume that $x_0$, $y_0$, $z_0$ are integers
with greatest common divisor 1. Suppose $P \ne \infty$ and put $Z_0 = \gcd(x_0, z_0)$,
$X_0 = x_0/Z_0$ so that $x_0 = X_0 Z_0$, $Z_0 \mid x_0$, $Z_0 \mid z_0$. Finally, for uniformity of
notation put $Y_0 = y_0$.

From (1) one sees immediately that

$$X_0^3 Z_0^3 = z_0 (Y_0^2 - a x_0 z_0 - b z_0^2). \tag{2}$$

Now $\gcd(Z_0, Y_0) = \gcd(\gcd(x_0, z_0), y_0) = 1$ and $Z_0 \mid z_0$, so that $Z_0$ is coprime to
the second factor on the right-hand side of (2). Hence, $Z_0^3 \mid z_0$ and we may
define t by $Z_0^3 t = z_0$. Substituting this value of $z_0$ for the first factor on the
right side of (2) and canceling gives the relation

$$X_0^3 = t (Y_0^2 - a x_0 z_0 - b z_0^2). \tag{3}$$

Now $1 = \gcd(x_0/Z_0, z_0/Z_0) = \gcd(X_0, Z_0^2 t)$. Since for any prime p such that
$p \mid t$ one has $p \mid X_0$, it follows that, after a sign adjustment, t = 1. Thus, $z_0 = Z_0^3$
and $\gcd(X_0, Z_0) = 1$. We can therefore write $(x_0, y_0, z_0) = (X_0 Z_0, Y_0, Z_0^3)$
with $(X_0, Z_0) = (Y_0, Z_0) = 1$. The corresponding affine coordinates for P
now become $(X_0/Z_0^2, Y_0/Z_0^3)$, where we observe that both terms are written
in lowest terms. In what follows an affine point on E is always written in
this form.

Substituting the values $X_0 Z_0$, $Z_0^3$, respectively, for $x_0$, $z_0$ in (2) gives

$$Y_0^2 = X_0^3 + a X_0 Z_0^4 + b Z_0^6. \tag{4}$$

We now introduce the important concept of the height H(P) of a point
P on E that is rational over $\mathbb{Q}$. First, $H(\infty) = 1$. For $P = (X_0 Z_0, Y_0, Z_0^3)$ we
define

$$H(P) = \max(|X_0|, Z_0^2). \tag{5}$$

Thus, it is the maximum, in absolute value, of the numerator and denominator
of the first coordinate of P in affine form. It may be thought of as a
measure of the "size" of the point P. This function and its associated
logarithmic function are discussed further in Chapter 20, where it will
enter in an important way in the precise statement of the Birch and
Swinnerton-Dyer conjectures.

The descent argument in the proof of the Mordell-Weil theorem requires
an estimate of the rate of growth of the height of P when P is
doubled.

From relation (4) we see that

$$|Y_0| \le \sqrt{1 + |a| + |b|}\,(H(P))^{3/2} = C\,H(P)^{3/2}, \tag{6}$$

where C is a positive constant depending only on E and not on the particular
point P. The following basic lemma follows immediately from the
definition and (4):

Lemma 1. If C is a fixed constant, then there are at most a finite number
of points P on E, rational over $\mathbb{Q}$, with $H(P) \le C$.
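
Lemma 1 can be made concrete by a brute-force search. The Python sketch below lists every affine point $(X_0/Z_0^2, \pm Y_0/Z_0^3)$ with height at most a given bound by running over the finitely many candidate pairs $(X_0, Z_0)$ and testing whether the right-hand side of (4) is a perfect square. The function names and the sample curve y^2 = x^3 - 2 used in the usage lines are assumptions made for illustration only.

```python
from fractions import Fraction
from math import gcd, isqrt


def height(x):
    """H(P) for an affine point with first coordinate x = X0 / Z0^2 in lowest terms."""
    x = Fraction(x)
    return max(abs(x.numerator), x.denominator)


def points_up_to_height(a, b, bound):
    """All affine rational points on y^2 = x^3 + a*x + b with H(P) <= bound.

    Candidates are X0 with |X0| <= bound and Z0 >= 1 with Z0^2 <= bound,
    gcd(X0, Z0) = 1; relation (4) then decides whether Y0 exists.
    """
    points = []
    Z = 1
    while Z * Z <= bound:
        for X in range(-bound, bound + 1):
            if gcd(X, Z) != 1:
                continue
            rhs = X ** 3 + a * X * Z ** 4 + b * Z ** 6   # right side of (4)
            if rhs < 0:
                continue
            Y = isqrt(rhs)
            if Y * Y == rhs:
                for s in sorted({Y, -Y}):                # both signs of Y0
                    points.append((Fraction(X, Z * Z), Fraction(s, Z ** 3)))
        Z += 1
    return points


if __name__ == "__main__":
    for P in points_up_to_height(0, -2, 200):
        print(P[0], P[1], height(P[0]))
```

The search space has at most (2C + 1)·√C candidate pairs for a bound C, which is the finiteness the lemma asserts; the loop is only practical for small bounds.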

Now fix a point $Q \ne \infty$ on E, and let P be any $\mathbb{Q}$-rational point on E such
that P - Q is not of order 2, i.e., $2P \ne 2Q$. Write, using homogeneous
coordinates as above,

$$P = (x_1 z_1, y_1, z_1^3), \quad 2P - Q = (x_2 z_2, y_2, z_2^3), \quad Q = (ce, d, e^3). \tag{7}$$

Write 2P = (2P - Q) + Q, where, by assumption, the two points on the
right side are distinct. Now apply (9) of §1 to conclude that, writing
$2P = (x_3 z_3, y_3, z_3^3)$,

$$\frac{x_3}{z_3^2} = \frac{(x_2 c/(z_2^2 e^2) + a)(x_2/z_2^2 + c/e^2) + 2b - 2 y_2 d/(z_2^3 e^3)}{(c/e^2 - x_2/z_2^2)^2}
= \frac{(x_2 c + a z_2^2 e^2)(x_2 e^2 + c z_2^2) + 2b z_2^4 e^4 - 2 z_2 y_2 d e}{(c z_2^2 - e^2 x_2)^2}.$$

Denote the numerator and denominator, respectively, of this last expression
by A and B. Since $\gcd(x_3, z_3) = 1$ we see that $x_3 \mid A$, $z_3^2 \mid B$. In
particular, $|x_3| \le |A|$, $z_3^2 \le |B|$, and by definition of height, we see that
$H(2P) \le \max(|A|, |B|)$. From (6) we know that $|y_2| \le C_1 (H(2P - Q))^{3/2}$
for a constant $C_1$ and trivially $|z_2| \le H(2P - Q)^{1/2}$. We conclude, examining
the expressions for A and B, that

$$H(2P) \le C\,H(2P - Q)^2, \tag{8}$$

where C is a constant depending only on Q.

Recall that $f(x) = (x - \theta_1)(x - \theta_2)(x - \theta_3)$, where now $\theta_1$, $\theta_2$, $\theta_3$ are
distinct algebraic integers. The discriminant of f(x), $-(4a^3 + 27b^2)$, is a
nonzero rational integer which we denote by $\delta$. From relation (8) of §1, we
conclude, after simplification, that for the point P and its double 2P one
has the relation

$$x_3 - \theta_i z_3^2 = z_3^2 \left[ \frac{-x_1^2 + 2\theta_i^2 z_1^4 + 2\theta_i x_1 z_1^2 + a z_1^4}{2 z_1 y_1} \right]^2 = \alpha_i^2, \quad i = 1, 2, 3.$$

Since the left-hand side is an algebraic integer, it follows that the $\alpha_i$ are
algebraic integers. Write, furthermore,

$$A + B\theta_i + C\theta_i^2 = z_3 \left[ \frac{-x_1^2 + 2\theta_i^2 z_1^4 + 2\theta_i x_1 z_1^2 + a z_1^4}{2 z_1 y_1} \right] = \alpha_i, \tag{9}$$



where A, B, C are rational numbers. Cramer’s rule shows that $\delta A$, $\delta B$, $\delta C$
are elements of $\mathbb{Z}[\alpha_1, \alpha_2, \alpha_3, \theta_1, \theta_2, \theta_3]$ that are, in fact, linear in $\alpha_1$, $\alpha_2$,
$\alpha_3$. Thus, $\delta A$, $\delta B$, $\delta C$ are algebraic integers that are rational, and we
conclude by Proposition 6.1.1 that they are, in fact, integers. From (9) one
sees easily that

$$2A - aC = -\frac{z_3}{z_1 y_1}\, x_1^2, \qquad C = \frac{z_3}{z_1 y_1}\, z_1^4. \tag{10}$$

Therefore, $\delta(2A - aC) = -\frac{\delta z_3}{z_1 y_1} \cdot x_1^2$ and $\delta C = \frac{\delta z_3}{z_1 y_1} \cdot z_1^4$ are rational
integers. If we write $\delta z_3/(z_1 y_1) = m/n$, $\gcd(m, n) = 1$, then $m x_1^2 = nR$, $m z_1^4 = nS$
for integers R and S. But $\gcd(x_1, z_1) = 1$. Hence, n = 1 and we have
consequently established the fact that $\delta z_3/(z_1 y_1)$ is an integer. It now follows
from (10) that

$$x_1^2 \le |\delta(2A - aC)|, \qquad z_1^4 \le |\delta C|. \tag{11}$$

Since $\alpha_i^2 = x_3 - \theta_i z_3^2$ we see that $|\alpha_i| \le C_1 \sqrt{H(2P)}$ for a suitable constant
$C_1$. Furthermore, as noted, $\delta(2A - aC)$ and $\delta C$ are linear combinations of
$\alpha_1$, $\alpha_2$, $\alpha_3$ with coefficients in $\mathbb{Q}(\theta_1, \theta_2, \theta_3)$. It follows now from (11) that,
for a suitable constant $C_2$, one has

$$H(P) \le C_2\,H(2P)^{1/4}. \tag{12}$$

Combining this relation with (8), we arrive at the important result that
there exists a constant $C_3$ depending only on Q such that for any point P
we have

$$H(P) \le C_3\,(H(2P - Q))^{1/2}. \tag{13}$$

Here $C_3$ has been adjusted to handle the finite number of exceptional P for
which 2P = 2Q. Now allowing Q to vary in a fixed finite set $\{Q_1, \ldots, Q_{n_0}\}$
we have shown the following lemma. All points are assumed rational
over $\mathbb{Q}$.

Lemma 2. Let $\{Q_1, \ldots, Q_{n_0}\}$ be a fixed set of points on E. Then there is
a constant C depending only on E and this set such that for any point P
one has

$$H(P) \le C\,(H(2P - Q_i))^{1/2}, \quad i = 1, \ldots, n_0.$$
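
Inequality (12), which feeds into Lemma 2, says roughly that doubling a point raises its height to about the fourth power. The Python sketch below gives a numerical feel for this on a sample curve by recording H(P), H(2P), H(4P), ... and the quantity log H(P) - (1/4) log H(2P), which (12) bounds above by log C_2. The curve y^2 = x^3 - 2, the point (3, 5), and the helper names are assumptions for the example; a few computed values illustrate the inequality and of course do not prove it.

```python
import math
from fractions import Fraction

# Sample curve y^2 = x^3 + A*x + B (illustrative assumption, not from the text).
A, B = 0, -2


def double(P):
    """Double an affine point with y != 0 using the tangent line at P."""
    x, y = P
    lam = (3 * x * x + A) / (2 * y)
    x3 = lam * lam - 2 * x
    return (x3, lam * (x - x3) - y)


def height(P):
    """H(P) = max(|X0|, Z0^2), read off from the first coordinate in lowest terms."""
    x = P[0]
    return max(abs(x.numerator), x.denominator)


def doubling_heights(P, steps):
    """Heights of P, 2P, 4P, ..., 2^steps P."""
    heights = [height(P)]
    for _ in range(steps):
        P = double(P)
        heights.append(height(P))
    return heights


def log_defects(heights):
    """log H(P) - (1/4) log H(2P) for consecutive entries; (12) bounds this above."""
    return [math.log(h) - math.log(h2) / 4 for h, h2 in zip(heights, heights[1:])]


if __name__ == "__main__":
    hs = doubling_heights((Fraction(3), Fraction(5)), 4)
    print(hs[:3])
    print(log_defects(hs))
```

Working with logarithms keeps the comparison usable even when the heights themselves become too large to convert to floating point.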

We are now ready to use a descent argument to complete the proof of the 
Mordell-Weil theorem. 

Recall from §4 that E/2E is a finite group. Let $Q_1, \ldots, Q_{n_0}$ be a set of
representatives in E for this group. Thus, for any point P there is a j,
$1 \le j \le n_0$, such that $P + Q_j = 2P'$ for some point $P' \in E$.

Theorem 19.5.1. The group $E(\mathbb{Q})$ of $\mathbb{Q}$-rational points on E is a finitely
generated abelian group.

Proof. Let P be an arbitrary point on E, rational over $\mathbb{Q}$. Then
$P + Q_{a_1} = 2P_1$ for some $a_1$ and $P_1$. We have by the preceding lemma

$$H(P_1) \le C\,(H(2P_1 - Q_{a_1}))^{1/2} = C\,(H(P))^{1/2}.$$

Similarly, write $P_1 + Q_{a_2} = 2P_2$ so that $P = 2P_1 - Q_{a_1} = 4P_2 - 2Q_{a_2} - Q_{a_1}$,
and $H(P_2) \le C\,(H(2P_2 - Q_{a_2}))^{1/2} = C\,H(P_1)^{1/2} \le C^{1 + 1/2} H(P)^{1/4}$. Continuing
in this manner we arrive at a sequence of points $P_r$ with

$$H(P_r) \le C^{1 + 1/2 + \cdots + 1/2^{r-1}}\,H(P)^{1/2^r} \tag{14}$$

and

$$P = 2^r P_r - 2^{r-1} Q_{a_r} - \cdots - Q_{a_1}. \tag{15}$$

Now the right-hand side of (14) approaches $C^2$ as r approaches infinity,
and therefore, there is an integer $r_0$ satisfying the condition that if $r \ge r_0$
then $H(P_r) \le C^2 + 1$. But by lemma 1 this last inequality is satisfied by
only a finite set of points, say, $P'_1, \ldots, P'_{s_0}$. Finally (15) shows that P
may be written as a finite linear combination, with integer coefficients, of
the points $Q_1, \ldots, Q_{n_0}, P'_1, \ldots, P'_{s_0}$. □
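
The convergence used in the last step of the proof is easy to watch numerically. In the Python sketch below, each step replaces a bound h by C·√h, which is one application of Lemma 2; the iterates are exactly the right-hand sides of the successive height bounds and approach C^2 whatever the starting height. The values C = 3 and H(P) = 10^30 are arbitrary assumptions chosen for the illustration, as are the function names.

```python
def descent_bounds(C, H, steps):
    """Successive upper bounds for H(P_1), H(P_2), ... obtained by
    repeatedly applying h -> C * sqrt(h), starting from h = H(P)."""
    bounds = []
    h = float(H)
    for _ in range(steps):
        h = C * h ** 0.5          # one application of Lemma 2
        bounds.append(h)
    return bounds


def first_small_index(C, H, max_steps=200):
    """Smallest r (1-based) with bound_r <= C^2 + 1, or None if not reached."""
    for r, h in enumerate(descent_bounds(C, H, max_steps), start=1):
        if h <= C * C + 1:
            return r
    return None


if __name__ == "__main__":
    bounds = descent_bounds(3.0, 1e30, 12)
    print([round(b, 3) for b in bounds])
    print(first_small_index(3.0, 1e30))
```

Because the exponent of H(P) halves at every step, the number of steps needed grows only like log log H(P), while the set of points of height at most C^2 + 1 does not depend on P at all.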

Notes 

The Mordell-Weil theorem and the arithmetic of elliptic curves in general 
have a long and rich history. In these notes we mention only a few salient 
points and refer the interested reader to the references at the end of this 
chapter. The literature in this subject is vast, and the references we have 
listed represent only a starting point for further study. 

Diophantus (circa 250) in Book 4 of his Arithmetica ([He], problem 24, 
p. 124) asks that a given integer be divided into two parts so that the 
product is the volume of a cube less its side. In geometric language this
amounts to finding the rational points on the cubic $y^3 - y = x(n - x)$. He
illustrates the method by choosing n = 6 and, after an informed guess,
puts y = 3x - 1. Substitution leads to a cubic with zero as a double root,
and he computes the third root to be 26/27. In geometric language one
observes that the above line is tangent to the cubic at (0, -1), and the
third intersection point, (26/27, 17/9), yields the two parts 26/27 and
136/27. Similarly, in problem 26 of Book 2
two numbers are sought such that their product added to either is a cube.
The propitious choice of 8x and $x^2 - 1$ leads to the problem of finding
rational points on the cubic $y^3 = (x^2 - 1)(8x + 1)$, and in modern terms,
Diophantus intersects the curve with the line y = 2x - 1. This line intersects
the cubic at (0, -1) and infinity. The third point, at x = 14/13, gives the
numbers 112/13 and 27/169. All this is accomplished without the aid of present
algebraic notation and, of course, there is no indication of a geometric
interpretation of the process, since he lived well over a thousand years before
the advent of analytic geometry. The use of the chord method to locate a
third point given two points is less easy to find in the ancient literature, 
and as Weil points out in his historical study of number theory ([W 3], p. 
108), it is none other than Newton, who, in a paper written in the 1670s 
([N], vol. 4, pp. 112-115), states that, beginning with three noncollinear 
points, iteration of the chord process leads, in general, to infinitely many 
rational points. However, no examples are given. 
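
The arithmetic in the two problems of Diophantus can be checked with exact rational numbers. The Python sketch below substitutes the values quoted above and confirms that they satisfy the stated conditions; the function names are assumptions made for the example, and the code verifies the numbers rather than reconstructing Diophantus's own procedure.

```python
from fractions import Fraction


def problem_24():
    """Divide n = 6 into two parts whose product is a cube less its side."""
    n = 6
    x = Fraction(26, 27)                  # nonzero root of 27x^3 - 26x^2 = 0
    y = 3 * x - 1                         # the point lies on the line y = 3x - 1
    on_cubic = (y ** 3 - y == x * (n - x))
    return x, n - x, y, on_cubic


def problem_26():
    """Two numbers whose product added to either one is a cube."""
    x = Fraction(14, 13)                  # nonzero root of 13x^2 - 14x = 0
    first, second = 8 * x, x * x - 1
    product = first * second
    cubes = (product + first == (2 * x) ** 3,
             product + second == (2 * x - 1) ** 3)
    return first, second, cubes


if __name__ == "__main__":
    print(problem_24())
    print(problem_26())
```

In problem 24 the parts are 26/27 and 136/27 and their product equals (17/9)^3 - 17/9; in problem 26 the two cubes are (28/13)^3 and (15/13)^3.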

The method of descent is invariably associated with Fermat, who used 
it to show, among other things, that a positive square is not the difference 
of two fourth powers. This new point of view was to be contrasted with 
the generation of new points through the process of doubling, that is, by 
iterating the tangent process. The actual iteration of the tangent method 
seems to have been initiated by Fermat, who developed the techniques of 
Diophantus, Viete, Bachet, and others, yet he does not appear to have 
used the chord process or to have interpreted these methods geometri¬ 
cally (see [W 3], p. 110). 

The efforts of Fermat were continued a century later by Euler, who 
gave rigorous proofs of many, but not all of Fermat’s assertions. In this 
connection the scholarly studies of Hoffman [Hof] and Bashmakova [B] 
as well as Weil [W3] are particularly useful. Lagrange, whose interest in 
number theory was stimulated and encouraged by Euler, also utilized the 
method of descent. In his memoir of 1777, concerning the equation
$2x^4 - y^4 = z^2$, Lagrange praises the method of Fermat, stating, “Le principe de
la démonstration de Fermat est un des plus féconds dans toute la Théorie
des nombres, et surtout dans celle des nombres entiers.” (The principle of
Fermat’s proof is one of the most fruitful in number theory, particularly over
the integers.)

In a long memoir, “Sur les propriétés arithmétiques des courbes algébriques,”
published in 1901 [P], Poincaré initiated a program (“... plutôt
un programme d’étude qu’une véritable théorie”) to study the arithmetic
of algebraic curves over the rationals of any genus, emphasizing the
birational point of view. The major portion of the paper deals with elliptic
curves. Using a Weierstrass parameterization (“argument elliptique”), he
shows how to generate subgroups of rational points on the curve
(“formule 1,” p. 492) and states, “On peut se proposer de choisir les arguments
... de telle façon que la formule (1) comprenne tous les points
rationnels de la cubique.” (One may propose to choose the arguments in
such a way that all the rational points on the cubic are contained in the
equation 1.) He defines the rank as the minimum number of “fundamental
points” necessary to generate the group and asks, “Quelles valeurs peut-on
attribuer au nombre entier que nous avons appelé le rang d’une
cubique rationnelle?” (Which values are assumed by the integer we have
called the rank of the cubic?)

This is, of course, still an open question. Only curves of relatively low
rank have been found. In 1982 Mestre [Mes] showed that there exists a
curve of rank at least 12 and that, assuming a variety of unsolved conjectures,
it has exact rank 12. In an important survey of Zagier [Z], it is
stated that Mestre also found examples of curves of rank as large as 14.
Whether or not the rank is bounded for curves defined over the rational
numbers is unsolved, although A. Néron in his annotations to Poincaré’s
paper states, “L’existence de cette borne est cependant considérée
comme probable.” (The existence of such a bound seems, however,
likely.) However, Zagier mentions in the survey that it is conjectured that
all values can occur. Indeed Cassels ([Ca 3], p. 257), in his now classical
survey of the arithmetic of elliptic curves, argues that the rank may well
be unbounded but that examples of curves with large rank may be difficult
to find since “an abelian variety can only have high rank if it is defined by
equations with very large coefficients.” For example, in 1986 Kretschmer
[Kr] proved that the curve $y^2 = x^3 + ax^2 + bx$, where a = 12273038545
and $b = 2^{10} \cdot 3^6 \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 37 \cdot 41 \cdot 43 \cdot 53$, has rank 10. We mention that A.
Néron, in 1954, was able to prove that infinitely many elliptic curves of
rank at least 11 exist. Kretschmer [Kr] gives a summary of the various
results bounding the rank from below. The plausibility of the hypothesis
that the rank is unbounded is also strengthened by the fact that in 1967
Tate and Shafarevich [Sh-T] proved that the analogous conjecture for
curves defined over a field of rational functions of one variable with
coefficients in a finite field is true.

Now the finiteness of the rank must be considered as part of Poincaré’s
program. Indeed 16 years later Hurwitz [Hur], in a paper in which certain
elliptic curves are constructed with rank 0 or 1, emphasized the conjectural
status of Poincaré’s statement by stating, at the conclusion of his
paper, “Wenn aber die Anzahl der rationale Punkte auf der Kurve unendlich
ist, so spricht a priori nichts dafür, dass auch dann immer endlich
viele fundamentale Punkte vorhanden sind. Bis also dieses nicht bewiesen
ist, sind die auf diese Annahme gegründeten Bemerkungen von Poincaré
in seiner mehrfach zitierten Arbeit entsprechend zu modifizieren.” (If,
however, the number of rational points is infinite, then it isn’t clear, a
priori, that a finite basis exists. Until this is shown the remarks of Poincaré
in his often cited article that are based on this assumption should be
modified.) Five years later Poincaré’s intuition (or oversight) was vindicated
with the 1922 publication by Mordell [M1] of the first proof that the
rank is finite. Cassels has written an interesting analysis of Mordell’s
paper ([Ca 2]) which should be studied by anyone interested in the history
of this fundamental result.

This proof and the subsequent proofs of this and its various generalizations
follow the same general strategy. First one shows that E/nE is finite
and then a descent argument using properties of an appropriately defined
height function completes the proof. The weak theorem can be proved by
constructing a nondegenerate pairing between E/nE and the Galois group
of the extension obtained by adjoining to k the coordinates of all points P,
algebraic over k, such that nP is rational. This can be shown to be a finite
extension and the result follows. (See [L2] and [Si].) It should be mentioned
that the weak (or as Weil calls it, the “petit”) Mordell-Weil theorem,
namely, the finiteness of E/nE, is discussed (for n = 3) in section 8 of
Poincaré’s memoir, using his “cubiques dérivées.” As A. Chatelet mentions
in his annotations to Poincaré’s memoir ([P], p. 546), this section is
the basis of the proofs of Mordell and Weil.

The chronology of Weil’s research into these matters is engagingly 
recorded by Weil himself in the annotations to his collected papers ([W2], 
vol. 1, pp. 524-526). After Mordell’s proof was brought to his attention by 
chance, he saw the possibility of using his own work to generalize the 
descent argument in Mordell to curves of arbitrary genus defined over an 
algebraic number field, the elliptic curve being replaced by the group of 
rational points on the Jacobian of the curve. This is accomplished in 
Weil’s thesis of 1928 ([W1], pp. 11-45). With the development of the
general theory of abelian varieties, due also to Weil, it became possible to 
extend the theorem to abelian varieties defined over a number field. (See 
Lang [L2], chapter 5, and the historical notes on pages 88-90.) 

In 1929 Weil published the short note that is presented, without the use 
of elliptic functions and with an interesting simplification due to Cassels 
([Ca2], pp. 31-34), in this chapter. Weil mentions that since his thesis 
would be difficult for some to read, it would perhaps be useful to publish a 
simplified proof for the case of genus 1. This proof avoids the use of his 
decomposition theorem. At the conclusion of the introduction to this 
paper Weil states “Je ne prétends pas que la démonstration qu’on va lire
soit essentiellement différente de celle de Mordell: et je serai satisfait si
j’ai contribué à mieux mettre en valeur les idées du mathématicien
anglais.” (I am not suggesting that the proof below differs essentially from
Mordell’s and I would be satisfied if I have contributed to a better understanding
of the ideas of this English mathematician.) This proof also appears
in Lang ([L2], pp. 101-105) and Mordell ([M2], chapter 16).

In 1961 and 1970 J. Tate lectured on the arithmetic of elliptic curves at
Haverford College. These excellent lectures, available for years in barely
visible mimeograph form, became the basis for Husemöller's book on
elliptic curves [Hus]. In a forthcoming book, Silverman and Tate ([Si-T])
have revised and expanded the Haverford Lectures, maintaining the elementary
nature of the original presentation. We also mention the delightful
little book of Chowla [Cho], as well as one by Chahal [Cha], for elementary
treatments of the Mordell-Weil theorem. In Chowla, by
assuming that the curve has three rational points of order 2 the proof
simplifies and becomes, according to him, “nothing beyond the capacity
of a ten year old.” In a more sophisticated direction we strongly recommend
the excellent text of Silverman [Si], especially Chapter 8. With this
well-written text the interested reader can continue the study of arithmetic
geometry at a more advanced level. Finally, we recommend the text
on elliptic curves and modular forms by Koblitz [Ko]. In this approach to
the arithmetic of elliptic curves Koblitz focuses on the essential solution
of a classical problem in number theory: the determination of those positive
square-free integers that can be the area of a right triangle with
rational sides. The solution depends on the arithmetic of the elliptic curve
$y^2 = x^3 - n^2 x$. Further applications of elliptic curves are discussed in the
survey of current results in Chapter 20, which also presents additional
references.

Chapter 20

New Progress in Arithmetic Geometry 


The decade of the eighties saw dramatic progress in 
the field of arithmetic geometry. Problems that were 
previously thought to be inaccessible by contemporary 
methods were in fact resolved. It is the purpose of this 
chapter to survey a portion of these dramatic develop¬ 
ments. 

The material covered falls into two parts. The first
part discusses the resolution of the Mordell conjecture
by Gerd Faltings in 1983. The second part summarizes
new results by B. Gross, V. Kolyvagin, K. Rubin, and
D. Zagier, which deal with the conjecture of Birch and
Swinnerton-Dyer that was discussed in Chapter 18.

The resolution of the Mordell conjecture has an immediate
application to Fermat's last theorem. In a
less transparent manner, the progress on elliptic
curves also has a surprising application to Fermat’s
last theorem. Work of G. Frey, J.P. Serre, and K.
Ribet can be combined to show that Fermat's last
theorem follows from a standard conjecture, the
Taniyama-Weil conjecture, about elliptic curves.

Another surprising application of the progress in
the theory of elliptic curves is the resolution of an old
conjecture of C.F. Gauss on the class numbers of
imaginary quadratic number fields. This comes about
by combining work of D. Goldfeld with a theorem of
Gross and Zagier, as we shall see.

The material discussed in this chapter is mathematically
sophisticated. We give few proofs, and some of
the definitions are not precise. Our goals are to sketch
these new results and to inspire the reader to learn
more by pursuing some of the references listed at the
end of the chapter.

§1 The Mordell Conjecture

In 1922 L.J. Mordell published a paper entitled “On the Rational Solu¬ 
tions of the Indeterminate Equation of Third and Fourth Degrees.” In the 
first part of the paper, he states and proves what is now referred to as the 
Mordell-Weil theorem for elliptic curves over Q. At the end of the paper, 
he discusses the situation for curves other than elliptic curves and conjec¬ 
tures that curves defined over Q that have genus greater than 1 can have 
only finitely many rational points. He further states that this is only a 
guess and that he has no real evidence or argument for its truth. This 
conjecture became known as the Mordell conjecture. Many papers were 
written proving that this or that curve had only finitely many rational 
points, but no very general result was forthcoming except for a famous 
theorem of C.L. Siegel (1929) on integral points on affine curves. This
states that a curve of positive genus defined by F(x, y) = 0, where
$F(x, y) \in \mathbb{Z}[x, y]$, has only finitely many solutions in $\mathbb{Z} \times \mathbb{Z}$.

The Mordell conjecture was generalized a bit as the years went by to 
state that a curve C defined over a number field K and having genus 
greater than 1 has only finitely many points rational over K, i.e., that 
C(K) must be finite. Note that if this were true, then C(L) would be finite 
for every number field L containing K. It is remarkable that until 1983 
there was not a single example of a curve known to have this property. In 
that year G. Faltings created a sensation in the mathematical world by 
writing a relatively short paper that proved the generalized Mordell con¬ 
jecture and several other important number-theoretical conjectures all at 
once. His accomplishment was built on the work of many others. We do 
not intend to give a history here, but merely mention some of the names of
people who did important work that was used by Faltings in his proof: S.
Arakelov, H. Grauert, Yu.I. Manin, A.N. Parshin, I.N. Shafarevich, L.
Szpiro, J. Tate, and J.G. Zarhin.

In our preceding discussion, the notion of the genus of a curve occurred
several times. This is an important concept that arose originally in
topology. It is now possible to give several definitions of the genus of a 
curve, all of which are equivalent. Let C/K be a curve defined over a 
field K. 

(a) Suppose $K \subset \mathbb{C}$ and that C is nonsingular. Then $C(\mathbb{C})$ can be given the
structure of a compact Riemann surface. Topologically, this is a torus
with g holes. The number of holes is the genus of C.

(b) Let $H_1(C(\mathbb{C}), \mathbb{Z})$ be the first homology group of $C(\mathbb{C})$ with coefficients
in $\mathbb{Z}$. This is a free abelian group with 2g generators. The number
g is the genus of C. (This definition is just a precise version of
part a.)

(c) The holomorphic differentials on C, $\Omega^1(C(\mathbb{C}))$, form a vector space
over $\mathbb{C}$ of dimension g. The number g is the genus.

Although (a) and (b) are hard to adapt to a curve defined over an
arbitrary field, (c) can be modified to apply in the general case. One
defines algebraic differentials on a curve, and a holomorphic differential is
one that has no pole in a purely algebraic sense.

One more definition will enable us to compute the genus of a few
concretely given curves. Let C be given by a homogeneous equation
F(x, y, z) = 0, where $F(x, y, z) \in K[x, y, z]$. We need the notion of an
ordinary double point. This is a singularity of a mild type. Recall that P =
(a, b, c) is a singular point of C if it is a zero of all three partial derivatives
$\partial F/\partial x$, $\partial F/\partial y$, and $\partial F/\partial z$. P is said to be an ordinary double point if it is a
singular point and the matrix

$$\begin{pmatrix}
\partial^2 F/\partial x^2 & \partial^2 F/\partial x\,\partial y & \partial^2 F/\partial x\,\partial z \\
\partial^2 F/\partial y\,\partial x & \partial^2 F/\partial y^2 & \partial^2 F/\partial y\,\partial z \\
\partial^2 F/\partial z\,\partial x & \partial^2 F/\partial z\,\partial y & \partial^2 F/\partial z^2
\end{pmatrix}$$

has rank 2 at P. A standard example is the point P = (0, 0, 1) on the curve

$$y^2 z = x^3 + z x^2.$$

(d) Let C/K be defined by F(x, y, z) = 0 as above. Suppose that F has
degree n and that the only singularities in $C(\bar{K})$ are ordinary double
points (here $\bar{K}$ is the algebraic closure of K). Then the genus of C is
given by (n - 1)(n - 2)/2 - r, where r is the number of double points.

A nonsingular conic has genus zero. Here n = 2 and r = 0. Recall that
the problem of Pythagorean triples was equivalent to finding all rational
solutions of $x^2 + y^2 = z^2$, a nonsingular conic. Another example, perhaps
less obvious, is the lemniscate, which was studied by a succession of
mathematicians, including Fagnano, Bernoulli, Abel, and Gauss.
This curve, whose graph resembles a figure eight, is defined by
$(x^2 + y^2)^2 = (x^2 - y^2)z^2$. It has degree 4 but there are three ordinary double
points, (0, 0, 1), $(1, \sqrt{-1}, 0)$, and $(1, -\sqrt{-1}, 0)$. By (d) above we calculate
the genus to be zero.
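
Formula (d) and the lemniscate example can be checked mechanically. The Python sketch below evaluates (n - 1)(n - 2)/2 - r and confirms, using complex arithmetic, that all three first partial derivatives of $F = (x^2 + y^2)^2 - (x^2 - y^2)z^2$ vanish at the three points listed. The function names are assumptions for the example, and the code checks only that the points are singular, not that they are ordinary double points.

```python
def plane_curve_genus(n, r):
    """Genus from (d): degree n, with r ordinary double points and no other
    singularities."""
    return (n - 1) * (n - 2) // 2 - r


def lemniscate_gradient(x, y, z):
    """Partial derivatives of F = (x^2 + y^2)^2 - (x^2 - y^2) z^2 at (x, y, z)."""
    s = x * x + y * y
    return (4 * x * s - 2 * x * z * z,       # dF/dx
            4 * y * s + 2 * y * z * z,       # dF/dy
            -2 * (x * x - y * y) * z)        # dF/dz


def is_singular(point):
    """True if every first partial derivative of F vanishes at the point."""
    return all(c == 0 for c in lemniscate_gradient(*point))


if __name__ == "__main__":
    double_points = [(0, 0, 1), (1, 1j, 0), (1, -1j, 0)]
    print([is_singular(p) for p in double_points])
    print(plane_curve_genus(2, 0), plane_curve_genus(4, len(double_points)))
```

With n = 4 and r = 3 the formula gives 3 - 3 = 0, matching the value stated for the lemniscate, while a nonsingular quartic (r = 0) would have genus 3.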

A nonsingular cubic must have genus 1, again by using (d). If a non¬ 
singular cubic has a rational point over the field of definition, it is an 
elliptic curve. A singular cubic must have genus 0. 

Consider the Fermat curve defined by $x^n + y^n = z^n$. It is easily seen to
be nonsingular. Thus, its genus is equal to (n - 1)(n - 2)/2. If n = 2, the
genus is 0; if n = 3, the genus is 1; and if n > 3, the genus is greater than 2.
When n = 2 there are infinitely many solutions, as we have seen (Chapter
17, §1). When n = 3 Euler showed there were no solutions in positive
rational numbers (Chapter 17, §8). Fermat’s last theorem asserts there are
no solutions in positive rational numbers for any n > 2. The Mordell
conjecture implies that for all n > 3 there are at most finitely many
solutions in rational numbers. This is, of course, much weaker than
Fermat’s assertion, but it is remarkable nevertheless. (As of 1980 Fermat’s
last theorem had been proved for all prime exponents less than 125,000.
That bound has undoubtedly been pushed much further by now.)

As a final example of an interesting family of curves, let us define a 
curve to be hyperelliptic if it is defined by an equation of the form 
$y^2 z^{n-2} = a_0 x^n + a_1 x^{n-1} z + \cdots + a_n z^n$, where $a_0 \neq 0$, 
and the polynomial on the right-hand side of the equation is assumed not to 
have repeated roots. If $n = 3$, we are again in the situation of a 
nonsingular cubic, so the genus is 1. If $n > 3$, the only singular point is 
the point at infinity $(0, 1, 0)$. The singularity is worse than a double 
point, so that (d) no longer tells us the genus. We simply record the answer. 
If n is odd, the genus is $(n - 1)/2$; if n is even, the genus is $(n - 2)/2$. 
If the reader is familiar with Riemann surface theory, the easiest way to see 
this is to use the Riemann-Hurwitz formula as it applies to a branched 
covering of the Riemann sphere. 
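
For readers who want the Riemann-Hurwitz count spelled out, here is a sketch. We assume the standard facts that $(x, y) \mapsto x$ is a two-sheeted covering of the Riemann sphere (genus 0), branched exactly over the n roots of the right-hand polynomial and, when n is odd, also over the point at infinity. Writing B for the number of branch points, each of which contributes 1 to the ramification term, the formula reads

$$2g - 2 = 2(2 \cdot 0 - 2) + B, \qquad \text{so} \qquad g = \frac{B}{2} - 1.$$

If n is odd then $B = n + 1$ and $g = (n - 1)/2$; if n is even then $B = n$ and $g = (n - 2)/2$, in agreement with the values recorded above.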

The interesting feature of all this is that the genus, which is essentially 
a topological invariant, controls the diophantine properties of a curve. We 
have a threefold division. Let C be a curve defined over a number field K. 

If the genus is zero, then either C(K) is empty or C(K) is infinite. This 
result is due to Hurwitz and Hilbert. We have already seen that there are 
infinitely many Pythagorean triples. As for the lemniscate, it is possible to 
give a rational parameterization 

$$x = 1 - m^4, \qquad y = 2m - 2m^3, \qquad z = 1 + 6m^2 + m^4.$$

Every $m \in K$ gives rise to a rational point $(x, y, z)$ on the lemniscate. 
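
The parameterization can be checked with exact rational arithmetic. The sketch below takes $x = 1 - m^4$, $y = 2m - 2m^3$, $z = 1 + 6m^2 + m^4$ and tests the lemniscate equation $(x^2 + y^2)^2 = (x^2 - y^2) z^2$ on a grid of rational parameters; the helper names are ours and are only illustrative.

```python
from fractions import Fraction


def lemniscate_point(m):
    """Projective point (x, y, z) attached to the rational parameter m."""
    m = Fraction(m)
    x = 1 - m**4
    y = 2 * m - 2 * m**3
    z = 1 + 6 * m**2 + m**4
    return x, y, z


def on_lemniscate(x, y, z):
    """Exact test of (x^2 + y^2)^2 = (x^2 - y^2) z^2."""
    return (x * x + y * y) ** 2 == (x * x - y * y) * z * z


# A small grid of rational parameters p/q (repeats are harmless).
samples = [Fraction(p, q) for p in range(-5, 6) for q in range(1, 6)]
all_on_curve = all(on_lemniscate(*lemniscate_point(m)) for m in samples)
```

The identity behind the check is $x^2 + y^2 = (1 - m^2)^2 z$ and $x^2 - y^2 = (1 - m^2)^4$, so both sides equal $(1 - m^2)^4 z^2$.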

If the genus is 1, then either C(K) is empty or C is an elliptic curve and 
consequently by the Mordell-Weil theorem C(K) is a finitely generated 
abelian group (which may be finite or infinite depending on C and K). 

If the genus is greater than 1, then we have Theorem 20.1.1. 

Theorem 20.1.1 (Faltings). Let C/K be a curve of genus greater than 1, 
defined over a number field K. Then C(K) is finite. (See [Co-Sil], [B], 
[Fa-Wu], and [Maz].) 

We end our short survey of this topic by mentioning that Paul Vojta 
found a new proof of the Mordell conjecture in 1989. He was led to his 
proof by means of a beautiful analogy, which he uncovered between the 
theory of meromorphic functions in complex analysis (Nevanlinna theory) 
and the theory of heights in number theory. The proof is in the 
tradition of diophantine approximation, a topic we touched on briefly in 
§12 of Chapter 17. These new ideas are very powerful and point the way 
to generalizations of the Mordell conjecture to higher dimensional 
algebraic varieties (see Vojta’s article in [Co-Sil] and [Lai]). Faltings [Fa] has 
built on Vojta’s ideas to prove a conjecture of Serge Lang that deals with 
subvarieties of abelian varieties. This is a significant advance since the 
Mordell conjecture is a corollary of Lang’s conjecture. 

§2 Elliptic Curves 

In this section we review some facts about elliptic curves, which we have 
already discussed, and add some new material as well. 

An elliptic curve E, over any field K, may be defined by a Weierstrass 
equation of the form 

$$y^2 z + a_1 xyz + a_3 y z^2 = x^3 + a_2 x^2 z + a_4 x z^2 + a_6 z^3,$$

where the coefficients are in K. There is one point at infinity, i.e., when 
$z = 0$, namely $(0, 1, 0)$. There is also a polynomial condition on the 
coefficients that ensures that E is nonsingular. When K is of characteristic 
different from 2 and 3, things are easier. In affine form E can be given by 
$y^2 = x^3 + ax + b$, where we require that 
$\Delta_E = -16(4a^3 + 27b^2) \neq 0$. $\Delta_E$ is called the discriminant 
of E. 

As we saw in Chapter 19, the rational points on E, namely, E(K), can 
be made into an abelian group for which the point at infinity is the zero 
element. We denote this point as O. For any field L containing K, E(L) is 
also a group and one can inquire about its structure. We will review some 
of what is known about this. 

To begin, suppose $K = \mathbb{F}$ is a finite field with q elements. Then 
$E(\mathbb{F})$ is contained in $\mathbb{P}^2(\mathbb{F})$, which is a set with 
$q^2 + q + 1$ elements. Thus, $E(\mathbb{F})$ is a finite group. Let N be the 
number of elements in $E(\mathbb{F})$. The congruence Riemann hypothesis 
implies that $|N - q - 1| \le 2\sqrt{q}$. See Chapter 18, §2, for a discussion 
in the case $\mathbb{F}$ is a prime field. 
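
For a prime field the bound is easy to test by brute force. The sketch below counts the points of $y^2 = x^3 + ax + b$ over $\mathbb{F}_p$, including the point at infinity, and compares the count with $p + 1$. It assumes p is an odd prime not dividing the discriminant; the function names are ours.

```python
def count_points(a: int, b: int, p: int) -> int:
    """Number of points on y^2 = x^3 + a x + b over F_p, counting the
    point at infinity. Assumes p is an odd prime of good reduction."""
    # how_many_roots[v] = number of y in F_p with y^2 = v
    how_many_roots = {}
    for y in range(p):
        v = y * y % p
        how_many_roots[v] = how_many_roots.get(v, 0) + 1
    total = 1  # the point at infinity
    for x in range(p):
        total += how_many_roots.get((x**3 + a * x + b) % p, 0)
    return total


def satisfies_hasse_bound(a: int, b: int, p: int) -> bool:
    """Check |N - p - 1| <= 2 sqrt(p) in the squared, integer-only form."""
    n = count_points(a, b, p)
    return (n - p - 1) ** 2 <= 4 * p


# y^2 = x^3 + x + 1 has discriminant -16 * 31, so these primes are good.
hasse_holds = all(satisfies_hasse_bound(1, 1, p) for p in (5, 7, 11, 13, 17, 19, 23))
```

Comparing squares avoids floating-point square roots, so the test is exact.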

If K is a number field, the Mordell-Weil theorem tells us that E(K) is a 
finitely generated group. There is another class of fields that behaves a lot 
like number fields. Let $\mathbb{F}(T)$ be a rational function field with 
coefficients in a finite field, and suppose K is a finite extension of 
$\mathbb{F}(T)$. K is called an algebraic function field in one variable over 
a finite field. For such a field, one can show that once again E(K) is 
finitely generated. Later we will discuss a very recent application of this 
result to a problem in the geometry of numbers. 

If $K = \mathbb{R}$, the real numbers, then $E(\mathbb{R})$ is topologically 
either a circle or a disjoint union of two circles, the second case occurring 
when $x^3 + ax + b$ has three real roots and the first when it doesn’t. 
Algebraically, either $E(\mathbb{R}) \cong T^1$ or 
$T^1 \times \mathbb{Z}/2\mathbb{Z}$, where 
$T^1 = \{z \in \mathbb{C} \mid |z| = 1\}$ is the unit circle in the 
complex plane. This fact has an immediate application to the structure of 
the torsion subgroup of $E(\mathbb{Q})$. Since 
$E(\mathbb{Q}) \subset E(\mathbb{R})$, it follows that 
$E(\mathbb{Q})_{\mathrm{tors}}$ is either cyclic or the direct sum of a cyclic 
group and $\mathbb{Z}/2\mathbb{Z}$. 

If $K = \mathbb{C}$, the complex numbers, then $E(\mathbb{C})$ is 
topologically a torus, i.e., a compact surface with genus 1. Algebraically, 
$E(\mathbb{C})$ is isomorphic to $T^1 \times T^1$. There is a better way to 
state this. On E there is a distinguished holomorphic differential $dx/y$. 
Remember that the space of such differentials is one dimensional over 
$\mathbb{C}$, so there is not much choice. If one integrates $dx/y$ over all 
closed paths on $E(\mathbb{C})$, the resulting set of complex numbers 
$\Lambda$ forms a lattice in $\mathbb{C}$, called the period lattice of E. (A 
lattice in a real vector space V is a subgroup consisting of all 
$\mathbb{Z}$-linear combinations of a vector space basis of V.) One has 
$E(\mathbb{C}) \cong \mathbb{C}/\Lambda$. One can make this map more explicit. 
Let P be any point on $E(\mathbb{C})$ and $\gamma$ a path from O to P. Map P to 
the integral along $\gamma$ of $dx/y$. The resulting map is not well defined, 
but it is well defined modulo $\Lambda$. This yields the preceding 
isomorphism. 

These considerations lead to a painless definition of the notion of 
complex multiplication in the case of an elliptic curve defined over a 
subfield of the complex numbers. To any such curve one associates a lattice 
$\Lambda$ by the process we have just described. One then considers the set 
$\mathcal{O} = \{z \in \mathbb{C} \mid z\Lambda \subseteq \Lambda\}$. 
$\mathcal{O}$ is a ring, as is easily seen. It always contains the integers 
$\mathbb{Z}$, and it usually consists precisely of $\mathbb{Z}$. When 
$\mathcal{O}$ is bigger than $\mathbb{Z}$, we say that E has complex 
multiplication. This makes some sense since anything in $\mathcal{O}$ that is 
not in $\mathbb{Z}$ must be complex. To see this, let $\lambda_1$ and 
$\lambda_2$ be a $\mathbb{Z}$-basis of $\Lambda$, i.e., 
$\Lambda = \mathbb{Z}\lambda_1 + \mathbb{Z}\lambda_2$. If 
$\omega \in \mathcal{O}$, then 
$\omega\lambda_i = a_{i1}\lambda_1 + a_{i2}\lambda_2$ for $i = 1, 2$, and the 
$a_{ij} \in \mathbb{Z}$. Let $\tau = \lambda_2/\lambda_1$. Since $\lambda_1$ 
and $\lambda_2$ generate $\mathbb{C}$ over $\mathbb{R}$, we must have that 
$\tau$ is not real. Since $\omega = a_{11} + a_{12}\tau$ and 
$\omega\tau = a_{21} + a_{22}\tau$, we see that $\tau$ satisfies a quadratic 
equation with coefficients in $\mathbb{Z}$. Thus, $\mathbb{Q}(\tau)$ is an 
imaginary quadratic number field. Moreover, 
$\omega^2 - (a_{11} + a_{22})\omega + (a_{11}a_{22} - a_{12}a_{21}) = 0$, so 
$\omega$ is an algebraic integer in $\mathbb{Q}(\tau)$. We have shown that 
either $\mathcal{O} = \mathbb{Z}$ or $\mathcal{O}$ is an order in an imaginary 
quadratic number field, i.e., a subring of the ring of algebraic integers in 
$\mathbb{Q}(\tau)$ that generates $\mathbb{Q}(\tau)$ over $\mathbb{Q}$. 

The curves we dealt with in Chapter 18 have complex multiplication. 
For the curve $y^2 = x^3 - Dx$, it can be shown that there is a real number 
$\lambda$ such that $\lambda$ and $i\lambda$ generate the period lattice (here 
$i = \sqrt{-1}$). Thus, $\Lambda = \mathbb{Z}[i]\lambda$ and 
$\mathcal{O} = \mathbb{Z}[i]$, the ring of Gaussian integers. In the case of 
an elliptic curve defined by $y^2 = x^3 + D$, it can be shown that there is a 
real number $\lambda$ such that $\lambda$ and $\omega\lambda$ generate the 
period lattice (here $\omega$ is a primitive cube root of 1). Thus, 
$\Lambda = \mathbb{Z}[\omega]\lambda$ and $\mathcal{O} = \mathbb{Z}[\omega]$, 
the ring of Eisenstein integers. 

The notion of complex multiplication can be given a completely 
algebraic definition. One has to define the notion of an algebraic 
endomorphism of an algebraic group. Then, if E is an elliptic curve, one can 
define the ring End(E) of all algebraic endomorphisms of E. For example, if 
$(x, y)$ is a point on $y^2 = x^3 - Dx$, we can define $i(x, y)$ to be 
$(-x, iy)$ and verify that this action gives an endomorphism on 
$E(\bar{K})$. Similarly, $\omega(x, y) = (\omega x, y)$ yields an algebraic 
endomorphism of $y^2 = x^3 + D$. In general there are three possibilities for 
the structure of End(E); it is isomorphic to $\mathbb{Z}$, or to an order in 
an imaginary quadratic number field, or to an order in a quaternion algebra 
(the last can occur only in characteristic $p \neq 0$). If 
$\mathrm{End}(E) \neq \mathbb{Z}$, we say E has complex multiplication. We 
will not pursue these ideas further here. 
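
As a small sanity check of the two examples, one can verify by machine that the maps $(x, y) \mapsto (-x, iy)$ and $(x, y) \mapsto (\omega x, y)$ send points of the respective curves to points of the same curves. The sketch below is an illustration under our own assumptions: it works over the finite field $\mathbb{F}_{13}$ (chosen because it contains both a square root of $-1$ and a primitive cube root of 1), takes $D = 1$, and checks only that the curve is preserved, not that the group law is respected.

```python
P = 13      # working prime (an assumption made for this illustration)
I = 5       # 5 * 5 = 25 = -1 (mod 13), so I plays the role of i
OMEGA = 3   # 3**3 = 27 = 1 (mod 13) and 3 != 1: a primitive cube root of 1
D = 1


def affine_points(rhs, p):
    """All affine solutions (x, y) in F_p x F_p of y^2 = rhs(x)."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y - rhs(x)) % p == 0]


points_i = affine_points(lambda x: x**3 - D * x, P)   # y^2 = x^3 - D x
points_w = affine_points(lambda x: x**3 + D, P)       # y^2 = x^3 + D

# (x, y) -> (-x, i y) preserves y^2 = x^3 - D x
i_preserves_curve = all(((-x) % P, (I * y) % P) in points_i
                        for x, y in points_i)

# (x, y) -> (omega x, y) preserves y^2 = x^3 + D
omega_preserves_curve = all(((OMEGA * x) % P, y) in points_w
                            for x, y in points_w)

# Applying the first map twice gives (x, -y), i.e. the inverse of (x, y),
# as it must since i^2 = -1.
i_squared_is_negation = all((x, (I * I * y) % P) == (x, (-y) % P)
                            for x, y in points_i)
```

The algebra behind the first check is $(iy)^2 = -y^2 = -(x^3 - Dx) = (-x)^3 - D(-x)$; the second uses $(\omega x)^3 = x^3$.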



§3 Modular Curves 


It is impossible to fully appreciate some of the new developments in the 
theory of elliptic curves without some background in the theory of 
modular curves. We will give a very brief introduction to these curves and 
their properties. 

One caveat before we begin. Up to now we have not been dealing with 
the most general notion of an algebraic curve. We have defined a curve as 
the solution set of a homogeneous polynomial in the projective plane. 
Curves also occur as one-dimensional subvarieties of higher dimensional 
projective spaces, and not every such curve “fits” into the plane. 
In §1 we wrote down a formula for the genus of a plane curve which 
showed that a nonsingular plane curve must have a genus of the form 
$(n - 1)(n - 2)/2$. It follows that, for example, a plane curve of genus 2 
must have singularities. But there are nonsingular genus 2 curves in 
$\mathbb{P}^3$. In what follows we use the word curve somewhat loosely but 
hope nevertheless to convey a good idea of what is going on. 

Modular curves parameterize families of elliptic curves with certain 
extra structure. We begin by considering pairs $(E, P)$, where E is an 
elliptic curve and P is a point on E of order N. We say two pairs $(E, P)$ 
and $(E', P')$ are isomorphic if there is an algebraic isomorphism $\phi$ from 
E to $E'$ such that $\phi(P) = P'$. There is a curve $Y_1(N)$ defined over 
$\mathbb{Q}$ whose points are in 1-to-1 correspondence with isomorphism 
classes of pairs $(E, P)$ of the type just described. Moreover, if $(E, P)$ 
corresponds to a point in $Y_1(N)(K)$, where K is an extension of 
$\mathbb{Q}$, then $(E, P)$ is equivalent to a pair $(E', P')$, where $E'$ is 
defined over K and $P' \in E'(K)$. (See [La5] and [Shim].) 

The curve $Y_1(N)$ is not complete in a sense we will not make precise. 
To make it complete it is necessary to add a finite number of points 
called cusps. The resulting complete curve is called $X_1(N)$. It is possible 
to compute the genus of $X_1(N)$, and it turns out that the genus is 0 if and 
only if $1 \le N \le 10$ or $N = 12$. This fact is essential in the proof of 
Mazur’s theorem on the structure of $E(\mathbb{Q})_{\mathrm{tors}}$, where E is 
any elliptic curve defined over $\mathbb{Q}$ (see Theorem 18.1.2). One big 
step in the proof is to show $Y_1(N)(\mathbb{Q})$ is empty if N is outside of 
the above range. For such N it follows that an elliptic curve over 
$\mathbb{Q}$ cannot have a rational point of order N. 

A second family of modular curves parameterizes isomorphism classes 
of pairs of the form $(E, C)$, where E is an elliptic curve and C is a cyclic 
subgroup of E of order N. As before, $(E, C)$ and $(E', C')$ are said to be 
isomorphic if there is an algebraic isomorphism $\phi$ from E to $E'$ such 
that $\phi(C) = C'$. There is an algebraic curve $Y_0(N)$ whose points are in 
1-to-1 correspondence with isomorphism classes of the pairs $(E, C)$. If 
$(E, C)$ corresponds to a point on $Y_0(N)(K)$, then $(E, C)$ is equivalent to 
a pair $(E', C')$, where $E'$ is defined over K and $C'$ is also defined over 
K (we say a subset S of $E(\bar{K})$ is defined over K if $\sigma(S) = S$ for 
all automorphisms $\sigma$ of $\bar{K}/K$; see [Shim]). 

$Y_0(N)$ is not complete and requires the addition of finitely many points 
(cusps) to make it into a complete curve $X_0(N)$. The genus can be 
computed, and one finds that the genus is 0 for $1 \le N \le 10$ and 
$N = 12, 13, 16, 18, 25$. The genus is 1 for 
$N = 11, 14, 15, 17, 19, 20, 21, 24, 27, 32, 36, 49$. Curves in this latter 
set are themselves elliptic curves. As we will see, the curves $X_0(N)$ form 
the key ingredient in the very important conjecture of Taniyama-Weil. 

It is interesting to see what these curves look like over the complex 
numbers. Let $\mathcal{H} = \{z \in \mathbb{C} \mid z = x + iy,\ y > 0\}$, the 
classical upper half-plane. The group $SL(2, \mathbb{R})$ of $2 \times 2$ 
matrices with coefficients in $\mathbb{R}$ and with determinant 1 acts on 
$\mathcal{H}$ by fractional linear transformations. If 
$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, we define $A(z)$ to be 
$(az + b)/(cz + d)$. The discrete subgroup $\Gamma = SL(2, \mathbb{Z})$ acts 
on $\mathcal{H}$ in a properly discontinuous manner (definition omitted) and 
the quotient space $\mathcal{H}/\Gamma$ has the structure of a 
one-dimensional complex manifold. In fact, 
$\mathcal{H}/\Gamma \approx \mathbb{C}$ and so $\mathcal{H}/\Gamma$ can be 
compactified by adding one point to yield the Riemann sphere, which is 
isomorphic to $\mathbb{P}^1(\mathbb{C})$. If $\Gamma'$ is any subgroup of 
$\Gamma$ of finite index, we can also form $\mathcal{H}/\Gamma'$ and by adding 
finitely many points in an appropriate manner compactify it to a compact 
Riemann surface $C(\Gamma')$. The natural map 
$\mathcal{H}/\Gamma' \to \mathcal{H}/\Gamma$ extends to an analytic map from 
$C(\Gamma')$ to $\mathbb{P}^1(\mathbb{C})$, which realizes $C(\Gamma')$ as a 
branched covering of the Riemann sphere. 

Define two families of subgroups of finite index in $\Gamma$: 

$$\Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma : c \equiv 0 \pmod{N} \right\},$$

$$\Gamma_1(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma : c \equiv 0,\ a \equiv d \equiv 1 \pmod{N} \right\}.$$

It is now not too hard to prove the following result. 

Proposition. $Y_0(N)(\mathbb{C}) \approx \mathcal{H}/\Gamma_0(N)$ and 
$Y_1(N)(\mathbb{C}) \approx \mathcal{H}/\Gamma_1(N)$. Moreover, 
$X_0(N)(\mathbb{C}) \approx C(\Gamma_0(N))$ and 
$X_1(N)(\mathbb{C}) \approx C(\Gamma_1(N))$. 

In this proposition "$\approx$" means “is analytically isomorphic to.” The 
group $\Gamma$ is sometimes called the modular group. The proposition shows 
the connection between certain subgroups of the modular group and the 
modular curves we discussed earlier. 

A very readable introduction to the modular group and its properties is 
given by Serre [Se]. Subgroups of the modular group are discussed in 
[Ko], [La5], [Ogg], and [Shim]. 

We are now in a position to state one of the most important conjectures 
in the whole subject. 

The Taniyama-Weil Conjecture. Let E be an elliptic curve defined 
over $\mathbb{Q}$. Then there is an integer N and a nonconstant rational map 
$\phi : X_0(N) \to E$ with $\phi$ defined over $\mathbb{Q}$. 


If E/Q is the algebraic image of some $X_0(N)$, we say that E is modular. 
The conjecture may be paraphrased as saying that every elliptic curve 
over Q is modular. 

This conjecture was first put forward by Taniyama at a conference on 
algebraic number theory held in Japan in 1955. In 1968 Weil [We] refined 
the conjecture by specifying that the integer N can be taken to be the 
conductor of E, a notion to be discussed later, and also proved an 
important theorem that made the conjecture very plausible. In 1971 G. Shimura 
proved, using Weil’s theorem, that every elliptic curve over Q that has 
complex multiplication is modular. There is a finite algorithm that allows 
one to check in any given case if an elliptic curve over Q is modular. This 
has been done in hundreds, perhaps by now thousands, of cases. The 
evidence in its favor seems overwhelming. 

This conjecture seems to have nothing at all to do with Fermat’s 
famous conjecture, his so-called last theorem. Nevertheless, the 
mathematician G. Frey discovered a connection. If $a^p + b^p = c^p$ is a 
solution in positive integers a, b, and c, where p is a prime different from 
2, Frey associates to such a solution the elliptic curve 
$E : y^2 = x(x - a^p)(x + b^p)$. 
He then shows this curve has such remarkable properties that it shouldn’t 
exist. J.-P. Serre had previously formulated a conjecture about modular 
functions that would prove this nonexistence if E were modular. K. Ribet 
in 1986 proved a special case of Serre’s conjecture that was powerful 
enough to yield the following theorem. 

Theorem (Frey, Serre, Ribet). The Taniyama-Weil conjecture implies 
Fermat’s last theorem. 

This, together with the results of Faltings discussed in §1, represents 
truly amazing and unexpected progress toward a resolution of Fermat’s 
last theorem. Oesterle discusses this theorem and gives a sketch of the 
proof in [Oesl]. 


§4 Heights and the Height Regulator 

The theory of heights plays a very important role in the subject of 
diophantine equations. As we saw in Chapter 19, it is a key ingredient in 
the proof of the Mordell-Weil theorem. In this section we briefly 
introduce the more general theory. One of our principal motivations is to 
give a definition and discussion of the height regulator, which is an 
important quantity associated with an elliptic curve defined over a number 
field. The height regulator also plays a role in the more refined version of 
the conjecture of Birch and Swinnerton-Dyer. 

Let us recall the definition of the height of a rational number. If 
$r \in \mathbb{Q}$, write $r = a/b$, where a and b are relatively prime 
integers. Define $H(r) = \max(|a|, |b|)$. This has the following two 
properties: $H(r) \ge 1$ for all $r \in \mathbb{Q}$, and for every C the set 
$\{r \in \mathbb{Q} \mid H(r) \le C\}$ is finite. We would like to extend H to 
a function on all of $\bar{\mathbb{Q}}$, the algebraic closure of 
$\mathbb{Q}$, in such a way that both these properties continue to hold. This 
turns out to be almost possible. The construction of such an extension is not 
too hard, but it would take us too far afield to give all the details here. 
We will show how to extend H to a function on all algebraic integers and 
refer the interested reader to some of the references given at the end of 
this chapter for the method in the general case of algebraic numbers (see 
[Sil], [Hu], [La3], or [La4]). 
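
The two properties of H on the rational numbers can be seen concretely. The following sketch computes $H(r)$ with exact fractions and lists all rationals of height at most a given integer bound; the function names are ours.

```python
from fractions import Fraction
from math import gcd


def height(r) -> int:
    """H(r) = max(|a|, |b|) for r = a/b in lowest terms."""
    r = Fraction(r)  # Fraction keeps numerator and denominator coprime
    return max(abs(r.numerator), abs(r.denominator))


def rationals_of_height_at_most(bound: int) -> set:
    """The finite set of rationals r with H(r) <= bound."""
    found = set()
    for b in range(1, bound + 1):
        for a in range(-bound, bound + 1):
            if gcd(a, b) == 1:  # gcd(0, b) == b, so 0 appears only as 0/1
                found.add(Fraction(a, b))
    return found


height_two = rationals_of_height_at_most(2)  # {0, ±1, ±2, ±1/2}
```

Every rational has height at least 1 (the value at 0 and at ±1), and the set of rationals of height at most C has at most $(2C + 1)C$ elements, which makes the finiteness statement explicit.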

Suppose $\alpha \in \bar{\mathbb{Q}}$ is an algebraic integer in some 
algebraic number field $K \subset \bar{\mathbb{Q}}$. Let 
$\sigma_1, \sigma_2, \ldots, \sigma_n$ be the imbeddings of K into the complex 
numbers, arranged in such a way that the first s of them are real 
imbeddings, the next t of them are distinct nonconjugate complex imbeddings, 
and $\sigma_{s+i}$ is the complex conjugate of $\sigma_{s+t+i}$ for 
$1 \le i \le t$. We then define the normalized absolute values as follows: 

$$\|\alpha\|_i = |\sigma_i \alpha| \quad \text{if } 1 \le i \le s,$$

$$\|\alpha\|_i = |\sigma_i \alpha|^2 \quad \text{if } s + 1 \le i \le s + t.$$


Definition. Let $\alpha$ be an algebraic integer in an algebraic number field 
K of degree n over $\mathbb{Q}$. The height of $\alpha$ is defined by 

$$H(\alpha)^n = \prod_{i=1}^{s+t} \max(1, \|\alpha\|_i).$$

It is not hard to check that $H(\alpha)$ is well defined and that if 
$\alpha \in \mathbb{Z}$, $H(\alpha)$ reduces to $\max(1, |\alpha|)$ as it 
should. 


Proposition 20.4.1. For all $\alpha \in \bar{\mathbb{Q}}$, 
$H(\alpha) \ge 1$. Moreover, if C and n are given, the set 
$\{\alpha \in \bar{\mathbb{Q}} \mid H(\alpha) \le C \text{ and } \deg(\alpha) \le n\}$ 
is finite. 


Proof. We cannot give the proof of the full result since we have defined 
$H(\alpha)$ only in the case $\alpha$ is an algebraic integer. We give the 
proof in this special case and remark that the proof of the general result is 
quite similar. 

The first assertion is clear from the definition. Now assume $\alpha$ is an 
algebraic integer with $H(\alpha) \le C$ and that 
$d = \deg(\alpha) = [\mathbb{Q}(\alpha) : \mathbb{Q}] \le n$. Then $\alpha$ 
satisfies a monic polynomial equation of degree d with coefficients in 
$\mathbb{Z}$: 
$x^d + a_1 x^{d-1} + \cdots + a_d = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_d)$. 
From the definition of height, we find that $|\alpha_i| \le C^d$ for all i, 
and it follows that $|a_i| \le \binom{d}{i} C^{di}$ for $1 \le i \le d$. Since 
d is bounded and the coefficients of the polynomial are 
bounded, there are only finitely many possible polynomials involved, and 
thus $\alpha$ must be one of only finitely many algebraic integers. □ 

Let E/K be an elliptic curve defined over an algebraic number field, 
and suppose it is given in affine form by a Weierstrass equation 
$y^2 = x^3 + ax + b$, with $a, b \in K$. If $P \in E(K)$, write 
$P = (x(P), y(P))$. As usual, denote by O the point at infinity on E. 

Definition. The height on E(K) is a function $h : E(K) \to \mathbb{R}$ given 
by $h(O) = 0$, and $h(P) = \log H(x(P))$ for $P \neq O$. 

The “log” in the definition denotes the natural logarithm. As will be 
seen, passing to the logarithm of H has a number of advantages. Note that 
for all P, $h(P) \ge 0$. Also, since $-(x, y) = (x, -y)$, it follows that 
$h(P) = h(-P)$. The following simple consequence of Proposition 20.4.1 will 
be important. 


Proposition 20.4.2. Let E/K be an elliptic curve defined over an algebraic 
number field K. For all C, the set $\{P \in E(K) \mid h(P) \le C\}$ is finite. 

Proof. By Proposition 20.4.1, the set 
$\{\alpha \in K \mid H(\alpha) \le e^C\}$ is finite. Since for each 
$\alpha \in K$ there are at most two values of $\beta$ such that 
$(\alpha, \beta) \in E(K)$, the result follows. □ 

Before going farther, we introduce some useful notation. If f and g are 
functions from some set X to $\mathbb{R}$, we define $f(x) = g(x) + O(1)$ to 
mean that $|f(x) - g(x)|$ is bounded above by a constant that may depend on f 
and g. Similarly, $f(x) \le g(x) + O(1)$ means that there is a constant C 
such that $f(x) \le g(x) + C$ for all $x \in X$. 

In Chapter 19, §4, two important properties of height on elliptic curves 
defined over $\mathbb{Q}$ were proved. In the present context, these may be 
reformulated as follows: For $P \in E(\mathbb{Q})$, 
$4h(P) \le h(2P) + O(1)$, and if $Q \in E(\mathbb{Q})$ is fixed, then 
$h(P + Q) \le 2h(P) + O(1)$. The first follows from equation (12), and the 
second is derived from equation (8) (it is not hard to see that 2P can be 
replaced with P in equation (8), and one can then replace P with P + Q). We 
now present an important generalization. 


Proposition 20.4.3. Let E/K be an elliptic curve defined over a number 
field K. For all $P, Q \in E(K)$ we have 

$$h(P + Q) + h(P - Q) = 2h(P) + 2h(Q) + O(1).$$

Proof. We sketch the proof, referring to Silverman [Sil] for details. 
Assuming that $K = \mathbb{Q}$, one can use the methods of Chapter 19 to 
establish that $h(P + Q) + h(P - Q) \le 2h(P) + 2h(Q) + O(1)$. The problem is 
to show the reverse inequality. Set $P = R + S$ and $Q = R - S$. One finds 
$h(2R) + h(2S) \le 2h(R + S) + 2h(R - S) + O(1)$. By previous results, we 
know that $4h(R) + 4h(S) \le h(2R) + h(2S) + O(1)$. Combining these 
inequalities and dividing by 2 yields the result. □ 


If we set $P = Q$, the relation $h(2P) = 4h(P) + O(1)$ falls right out. It is 
in fact not hard to show that $h(mP) = m^2 h(P) + O(1)$ for all integers m. In 
Proposition 20.4.3 substitute mP for P and P for Q. Assuming the result 
for integers k such that $1 \le k \le m$, we find that 

$$h((m + 1)P) + (m - 1)^2 h(P) = 2m^2 h(P) + 2h(P) + O(1).$$

Thus, $h((m + 1)P) = (m + 1)^2 h(P) + O(1)$, and so we are done by 
induction. 

All of this makes it plain that the height function on an elliptic curve 
behaves very much like a quadratic form, aside from the 0(1) terms. Both 
A. Neron and J. Tate found ways to modify the definition so as to get a 
quadratic form on E(K), which behaves like the height function. Both 
methods have advantages, but we will present Tate’s because it is more 
elementary. 


Definition. Let E/K be an elliptic curve defined over an algebraic number 
field K. For $P \in E(K)$ define $\hat{h}(P)$, the canonical height of P, by 
the formula 

$$\hat{h}(P) = \lim_{n \to \infty} 4^{-n} h(2^n P).$$

To show the limit in this definition exists, it is sufficient to show that 
the terms define a Cauchy sequence. Suppose $n > m \ge 0$. Then 

$$|4^{-m} h(2^m P) - 4^{-n} h(2^n P)| \le \sum_{i=0}^{n-m-1} |4^{-m-i} h(2^{m+i} P) - 4^{-m-i-1} h(2^{m+i+1} P)|. \qquad (1)$$

There is a constant C such that $|4^{-1} h(2Q) - h(Q)| \le C$ for all 
$Q \in E(K)$. The ith term in the sum is therefore at most $4^{-m-i} C$. Thus, 
the sum is dominated by 
$(4^{-m} + 4^{-m-1} + \cdots + 4^{-n+1}) C < 4^{-m+1} C$. This shows the terms 
form a Cauchy sequence. 
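
The convergence of $4^{-n} h(2^n P)$ can be watched numerically. The sketch below is an illustration under our own choices: the curve $y^2 = x^3 - 2$ over $\mathbb{Q}$ and the point $P = (3, 5)$ are not taken from the text, and the function names are ours. It uses exact rational arithmetic for the doubling formula and the height $h(P) = \log H(x(P))$ defined above.

```python
from fractions import Fraction
from math import log

A, B = 0, -2  # the curve y^2 = x^3 + A x + B (illustrative choice)


def on_curve(point) -> bool:
    x, y = point
    return y * y == x**3 + A * x + B


def double(point):
    """2P for an affine point P = (x, y) with y != 0 (tangent-line formula)."""
    x, y = point
    if y == 0:
        raise ValueError("P has order 2, so 2P is the point at infinity")
    slope = (3 * x * x + A) / (2 * y)
    x2 = slope * slope - 2 * x
    y2 = slope * (x - x2) - y
    return (x2, y2)


def naive_height(point) -> float:
    """h(P) = log max(|a|, |b|) where x(P) = a/b in lowest terms."""
    x = point[0]
    return log(max(abs(x.numerator), abs(x.denominator)))


def canonical_height_approximations(point, steps: int):
    """The numbers 4^{-n} h(2^n P) for n = 0, 1, ..., steps."""
    values = []
    current = point
    for n in range(steps + 1):
        values.append(naive_height(current) / 4**n)
        if n < steps:
            current = double(current)
    return values


P0 = (Fraction(3), Fraction(5))
approximations = canonical_height_approximations(P0, 5)
```

Since the nth term differs from the limit by at most a constant times $4^{-n}$, a handful of doublings already pins down the first few digits; the price is that the numerator and denominator of $x(2^n P)$ have on the order of $4^n$ digits.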

The important properties of the canonical height are summarized in the 
following theorem. 


Theorem 20.4.4. The canonical height $\hat{h}(P)$ satisfies 

(i) $\hat{h}(P) = h(P) + O(1)$. 

(ii) $\langle P, Q \rangle = \frac{1}{2}(\hat{h}(P + Q) - \hat{h}(P) - \hat{h}(Q))$ is bi-additive. 

(iii) $\hat{h}(mP) = m^2 \hat{h}(P)$ for all $m \in \mathbb{Z}$. 

(iv) $\hat{h}(P) \ge 0$, with equality holding if and only if P is a torsion point. 

(v) If g(P) is any function satisfying (i) and (iii), then $g = \hat{h}$. 

Proof. We sketch the proof, referring the reader to the references for the 
details (we note parenthetically that the canonical height in [Sil] is half the 
one defined here). 

In equation (1) set $m = 0$ and take the limit of both sides as n tends to 
infinity. Since we have shown that the right-hand side is dominated by 4C, 
it follows that $|h(P) - \hat{h}(P)| \le 4C$, which proves (i). 

In Proposition 20.4.3, replace P and Q by $2^n P$ and $2^n Q$, respectively, 
divide through by $4^n$, and pass to the limit as n tends to infinity. The 
result is that the canonical height satisfies the parallelogram law: 
$\hat{h}(P + Q) + \hat{h}(P - Q) = 2\hat{h}(P) + 2\hat{h}(Q)$. Property (ii) 
follows from this identity by an exercise in pure algebra. We omit the 
details. 

Property (iii) can be derived in two ways. One can start with the fact 
that the height function h satisfies the property up to O(1) terms and then 
get the result by replacing P by $2^n P$, dividing by $4^n$, and passing to 
the limit. Alternatively, it follows by a formal induction using property 
(ii). 

Since $h(P) \ge 0$ for all points P, it follows that the same is true of 
$\hat{h}(P)$. If P is a torsion point, there is an $m \in \mathbb{Z}$, 
$m \neq 0$, such that $mP = O$. Thus, 
$0 = \hat{h}(O) = \hat{h}(mP) = m^2 \hat{h}(P)$, and so $\hat{h}(P) = 0$. 
Conversely, suppose $\hat{h}(P) = 0$. Then, by property (iii), 
$\hat{h}(mP) = 0$ for all integers m. However, using property (i) and 
Proposition 20.4.1, we see that $\{mP \mid m \in \mathbb{Z}\}$ is a finite 
set. This can happen only if P is a torsion point. 

Finally, assume that g(P) satisfies (i) and (iii). From (i) we see that 
there is a constant C such that $|\hat{h}(P) - g(P)| \le C$ for all points P. 
Choose any $m > 1$, and replace P by $m^k P$. Then, using property (iii), we 
find $|\hat{h}(P) - g(P)| \le C m^{-2k}$. Now let k tend to infinity. The 
result is $\hat{h}(P) = g(P)$. (Note that one only has to assume that (iii) 
holds for one integer $m \ge 2$.) □ 

Definition. Let E/K be an elliptic curve defined over a number field K. 
Let $P_1, P_2, \ldots, P_r$ be a basis for the free part of E(K), i.e., every 
point of E(K) can be uniquely written as the sum of a torsion point and a 
$\mathbb{Z}$-linear combination of the $P_i$. Let $\mathfrak{R}$ be the matrix 
whose ijth entry is $\langle P_i, P_j \rangle$. Then R(E/K), the height 
regulator of E/K, is defined to be the determinant of $\mathfrak{R}$. 

Just as is the case with the regulator of a number field, the height 
regulator of an elliptic curve has a geometric interpretation. To get an idea 
of how this works, we have to introduce the real vector space 
$V(K) = \mathbb{R} \otimes E(K)$, which has dimension r over $\mathbb{R}$. For 
those readers who are unfamiliar with tensor products, it is possible to give 
a more concrete construction of V(K). Its points consist of formal 
$\mathbb{R}$-linear combinations of the $P_i$, i.e., expressions of the form 
$\sum x_i P_i$ with the $x_i \in \mathbb{R}$. Addition and subtraction are 
performed coordinatewise, scalar multiplication by the rule 
$t \sum x_i P_i = \sum t x_i P_i$. The height pairing $\langle P, Q \rangle$ 
can be extended to V(K) in the obvious manner; if $X = \sum x_i P_i$ and 
$Y = \sum y_j P_j$, then 
$\langle X, Y \rangle = \sum_{i,j} x_i y_j \langle P_i, P_j \rangle$. What is 
not as obvious as it seems is that this extended inner product is positive 
definite on V(K). It is true. The proof involves almost all of 
Theorem 20.4.4. Assuming this result, choose an orthonormal basis 
$e_1, e_2, \ldots, e_r$ for V(K). Now put the usual measure on the Euclidean 
space V(K) so that the unit cube 
$\{\sum t_i e_i \mid 0 \le t_i \le 1 \text{ for } i = 1, 2, \ldots, r\}$ has 
volume equal to 1. 

We define a map $\phi$ from E(K) to V(K) as follows: Every $P \in E(K)$ can 
be uniquely written as $T + \sum n_i P_i$, where T is a torsion point, and the 
$n_i \in \mathbb{Z}$ (the sum here is not formal; it is addition on the 
elliptic curve E). Define $\phi(P) = \sum n_i P_i \in V(K)$. It is easy to see 
that $\phi$ is a homomorphism with kernel equal to $E(K)_{\mathrm{tors}}$ and 
with image a lattice in V(K). A fundamental domain for this lattice is given 
by $\{\sum t_i P_i \mid 0 \le t_i < 1 \text{ for } i = 1, 2, \ldots, r\}$. To 
compute the volume of this fundamental domain is a standard exercise in 
linear algebra. One writes $P_j = \sum_i a_{ij} e_i$ with the 
$a_{ij} \in \mathbb{R}$ and the volume in question is equal to 
$|\det[a_{ij}]|$. Let $\mathcal{A} = [a_{ij}]$. Since the $e_i$ are 
orthonormal, one sees that $\mathfrak{R} = \mathcal{A}^t \mathcal{A}$ 
($\mathcal{A}^t$ is the transpose of $\mathcal{A}$) and it follows that 
$R(E/K) = (\det \mathcal{A})^2$. We have proved Proposition 20.4.5. 

Proposition 20.4.5. The height regulator, R(E/K), is the square of the 
volume of a fundamental domain for the lattice $\phi(E(K))$ in the vector 
space V(K). 

Using this geometric interpretation and some standard arguments from 
the geometry of numbers, we can deduce a very interesting result about 
the distribution of rational points on an elliptic curve. 

Theorem 20.4.6. Let E/K be an elliptic curve defined over a number field 
K. Suppose that E(K) has rank r. Let N(R) be the number of elements 
$P \in E(K)$ such that $h(P) \le R$. Then there is a constant C such that 
$N(R) \sim C R^{r/2}$ (here the “$\sim$” means that the ratio of the two sides 
tends to 1 as R tends to infinity). More precisely, the constant C is equal to 
$\gamma_r |E(K)_{\mathrm{tors}}| / \sqrt{R(E/K)}$, where $\gamma_r$ is the 
volume of the unit sphere in Euclidean r-space 
($\gamma_r = \pi^{r/2}/\Gamma(1 + r/2)$). 

Proof. Let L be a lattice in Euclidean n-space, $\mathbb{R}^n$. Then the 
number of elements in the set $\{\lambda \in L \mid \|\lambda\| \le R\}$ is 
asymptotic to the volume of the sphere of radius R divided by the volume of a 
fundamental domain for the lattice L. This is a standard result that is 
intuitively clear and not too hard to establish. We now apply it to the 
lattice $\phi(E(K))$ in V(K). 

First, notice that 
$\|\phi(P)\|^2 = \langle \phi(P), \phi(P) \rangle = \frac{1}{2}(\hat{h}(2P) - 2\hat{h}(P)) = \hat{h}(P)$, 
and by Theorem 20.4.4, part (i), $\hat{h}(P)$ differs from h(P) by a 
bounded amount. Thus, N(R) differs from the product of 
$|E(K)_{\mathrm{tors}}|$ and the number of points in the set 
$\{\phi(P) \mid P \in E(K),\ \|\phi(P)\| \le R^{1/2}\}$ by a bounded 
amount ($|E(K)_{\mathrm{tors}}|$ enters into this because it is equal to the 
order of the kernel of $\phi$). The number of elements in the latter set is 
asymptotic to $\gamma_r R^{r/2}$ divided by the volume of the lattice 
$\phi(E(K))$, which is, by Proposition 20.4.5, the square root of R(E/K). The 
theorem follows. □ 
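
The lattice-point count used at the start of the proof can be illustrated in the simplest case. The sketch below is a toy example of ours, not tied to any elliptic curve: it takes the lattice $\mathbb{Z}^2$ in the plane (so $r = 2$, fundamental domain of volume 1, $\gamma_2 = \pi$) and compares the number of lattice points $\lambda$ with $\|\lambda\|^2 \le R$ against the predicted main term $\pi R$.

```python
from math import isqrt, pi


def lattice_points_in_disc(radius_squared: int) -> int:
    """Number of (a, b) in Z^2 with a^2 + b^2 <= radius_squared."""
    count = 0
    a_max = isqrt(radius_squared)
    for a in range(-a_max, a_max + 1):
        # For fixed a, b ranges over -b_max, ..., b_max.
        b_max = isqrt(radius_squared - a * a)
        count += 2 * b_max + 1
    return count


# Ratio of the exact count to the main term gamma_2 * R^(2/2) = pi * R.
ratios = [lattice_points_in_disc(R) / (pi * R) for R in (10, 100, 1000, 10000)]
```

The ratios tend to 1 as R grows, which is the r = 2 instance of the asymptotic statement with $|E(K)_{\mathrm{tors}}|$ and the regulator both replaced by 1.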

There are many interesting open problems concerning the canonical 
height. For example, here is a conjecture due to Serge Lang (see [La3], 
p. 92). Let E be an elliptic curve defined by $y^2 = x^3 + ax + b$ with a and 
b in $\mathbb{Z}$. Assume $E(\mathbb{Q})$ has positive rank. Since 
$\phi(E(\mathbb{Q}))$ is a lattice in $V(\mathbb{Q})$, there is a nontorsion 
point $P_1 \in E(\mathbb{Q})$ such that $\hat{h}(P_1)$ is least. Lang 
conjectures that there is a constant C independent of E such that 
$\hat{h}(P_1) \ge C \log |\Delta_E^{\min}|$, where $\Delta_E^{\min}$ is the 
minimal discriminant of E. This number divides the discriminant of E. (For a 
precise definition, see page 224 of [Sil].) This conjecture has been proved 
by J. Silverman [Sil2] in special cases. For example, let $j_E$ be defined by 
$-1728(4a)^3/\Delta_E$. Silverman has shown that Lang’s conjecture is true if 
one considers only elliptic curves E such that $j_E$ is an integer (this 
holds automatically if E/Q is an elliptic curve with complex multiplication). 
As a result, he is able to prove [Sil3] that there is a constant C such that 
for all curves E with $j_E \in \mathbb{Z}$, 
$|E(\mathbb{Z})| \le C^{1 + r_E}$, where $r_E$ is the rank of 
$E(\mathbb{Q})$. Notice that this shows that if you could find elliptic 
curves with $j_E \in \mathbb{Z}$ and many integral points, you would force 
the rank to be large. It is conjectured (also by Lang; [La3], page 140) that 
inequalities of this type hold without restriction on $j_E$, and also, 
appropriately formulated, over any number field. (For more conjectures about 
the canonical heights of elements of a basis for E(K), see [La6].) 

We end this section by noting that Noam Elkies used some of the ideas 
discussed in this section to provide examples of lattices in Euclidean 
space with extraordinarily good sphere-packing properties. Instead of 
number fields, he works over rational function fields $F(T)$, where $F$ is a
finite field. By choosing the elliptic curve E and the finite field F very 
carefully, he is able to produce lattices that equal or better the best known 
examples, at least in all dimensions less than or equal to 1024. Once again, 
this illustrates the fact that the arithmetic theory of elliptic curves has 
deep and surprising applications in other areas of mathematics. 


§5 New Results on the Birch-Swinnerton-Dyer 
Conjecture 

In this section we begin by reviewing the definitions that go into the 
Birch-Swinnerton-Dyer conjecture. The discussion is similar to that of §2 
of Chapter 18. Here we work over a general number field K and also make 
the conjecture more precise. 

Let $E$ be defined by an equation $y^2 = x^3 + ax + b$ with coefficients in
$O_K$, the ring of integers in $K$. Let $\mathfrak{p}$ be a prime ideal in $O_K$ and let $N_{\mathfrak{p}} - 1$
be the number of solutions to the congruence $y^2 \equiv x^3 + ax + b \pmod{\mathfrak{p}}$.
Let $N\mathfrak{p} = |O_K/\mathfrak{p}|$ and define $c_{\mathfrak{p}} = N\mathfrak{p} + 1 - N_{\mathfrak{p}}$.

Definition. $L^{*}(E/K, s) = \prod_{\mathfrak{p}} (1 - c_{\mathfrak{p}} N\mathfrak{p}^{-s} + N\mathfrak{p}^{1-2s})^{-1}$, where the product
is over all nonzero prime ideals of $O_K$ not dividing $\Delta_E$. By multiplying by
suitable factors at the primes dividing $\Delta_E$, one arrives at $L(E/K, s)$, the
$L$-function of $E$ over $K$.

The idea of considering $L(E/K, s)$ is that, since it contains information
about all of the foregoing congruences, it should contain a lot of information
about the arithmetic of $E$.

First a word about the convergence of $L(E/K,s)$. If $\mathfrak{p}$ does not divide
$\Delta_E$, then it can be proved that $|c_{\mathfrak{p}}| \le 2(N\mathfrak{p})^{1/2}$. This was proved by Hasse
in the 1930s. Examples of this phenomenon go back to Gauss. Here we
are concerned with elliptic curves over finite fields. As we noted earlier in
this book, Weil conjectured similar results for nonsingular algebraic varieties
over finite fields and proved his conjectures in several important
cases. The general Weil conjecture, the congruence Riemann hypothesis,
was proved by Deligne in 1973.
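
To make the definition of the local numbers and Hasse's inequality concrete, here is a small sketch over $K = \mathbf{Q}$, where a prime ideal is just a rational prime $p$ and its norm is $p$. It counts solutions of the congruence by brute force and checks the bound $c_p^2 \le 4p$. The function names are illustrative, and the sample curve $y^2 = x^3 - x$ is chosen only because it has good reduction at every odd prime.

```python
def affine_solutions(a: int, b: int, p: int) -> int:
    """Number of pairs (x, y) mod p with y^2 = x^3 + a*x + b (mod p)."""
    # squares[t] = number of y mod p with y^2 = t (mod p)
    squares = [0] * p
    for y in range(p):
        squares[y * y % p] += 1
    return sum(squares[(x * x * x + a * x + b) % p] for x in range(p))

def c_p(a: int, b: int, p: int) -> int:
    """c_p = p + 1 - N_p, where N_p - 1 is the number of affine solutions."""
    n_p = affine_solutions(a, b, p) + 1
    return p + 1 - n_p

# Sample curve y^2 = x^3 - x, i.e. a = -1, b = 0.
for p in (3, 5, 7, 11, 13, 17, 19, 23, 29):
    c = c_p(-1, 0, p)
    assert c * c <= 4 * p  # Hasse: |c_p| <= 2 sqrt(p)
    print(p, c)
```

Brute-force counting is only practical for small primes; it is used here because it follows the definition literally.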

To get back to our story, the inequality $|c_{\mathfrak{p}}| \le 2(N\mathfrak{p})^{1/2}$ easily implies
that $L(E/K,s)$ converges for $\mathrm{Re}(s) > 3/2$. Another conjecture of Weil,
closely related to the Taniyama-Weil conjecture, is the following:


Conjecture (Weil). L(E/K,s) can be analytically continued to the entire 
complex plane and satisfies a functional equation. 

For simplicity we state the conjectured functional equation in the special
case when $E$ is defined over $\mathbf{Q}$. There is an integer $N_E$, called the
conductor of $E$. $N_E$ divides $\Delta_E$ and is divisible only by primes where $E$ has
“bad” reduction. We omit the precise definition. Let

$$\Lambda_E(s) = N_E^{s/2} (2\pi)^{-s} \Gamma(s) L(E/\mathbf{Q}, s).$$

Then (conjecturally) $\Lambda_E(s)$ can be analytically continued to an entire function
on all of $\mathbf{C}$, and $\Lambda_E(s) = \varepsilon \Lambda_E(2 - s)$, where $\varepsilon = 1$ or $-1$ is called the
sign of the functional equation.

Weil proved this in special cases. In 1954 Deuring proved it when E has 
complex multiplication. Eichler (1954) and Shimura (1958) proved it when 
E/Q is a modular elliptic curve. Thus, if the Taniyama-Weil conjecture is 
correct, the preceding conjecture would follow over Q. In any case, if E 
has complex multiplication, or more generally if E is modular, one can 
consider $L(E/K,s)$ as an analytic function around the point $s = 1$.


The Conjecture of Birch and Swinnerton-Dyer. Assuming the analytic
continuation, $L(E/K,s)$ has a zero of order $r$, the $\mathbf{Z}$-rank of $E(K)$, at $s = 1$.
Moreover, $(s - 1)^{-r} L(E/K,s) \to M_E$ as $s \to 1$, where $M_E$ is a constant with
the structure

$$M_E = 2^{t} \, |D_K|^{-1/2} \, |\text{Ш}(E/K)| \, R(E/K) \, |E(K)_{\mathrm{tors}}|^{-2} \prod_{\mathfrak{p}} d_{\mathfrak{p}}.$$

Here $t$ is half the number of complex embeddings of $K$ over $\mathbf{Q}$, $D_K$ is the
discriminant of $K$, $\text{Ш}(E/K)$ is the Tate-Shafarevich group of $E$ over $K$,
$R(E/K)$ is the canonical height regulator of $E(K)$ (described in the last
section), and the $d_{\mathfrak{p}}$ are numbers that are 1 unless $\mathfrak{p}$ is a prime of bad
reduction or an archimedean prime. If $\mathfrak{p}$ is nonarchimedean, $d_{\mathfrak{p}}$ is a
positive integer; if $\mathfrak{p}$ is archimedean, then $d_{\mathfrak{p}}$ is given by a period integral.

$\text{Ш}(E/K)$ is a very important group associated to $E$. It arises in connection
with the problem of computing the rank of a given elliptic curve. The
definition is somewhat technical, and we shall not give it here (see, e.g.,
Chapter 10, §4 of [Sil]). $\text{Ш}(E/K)$ is conjectured to be finite, but until
recently this was not known to be true for any single case. If one could
find an effective upper bound for $|\text{Ш}|$, a consequence would be the existence
of a finite algorithm for determining the rank of $E(K)$ in any given
case. Around 1972 Tate made the following comment on the Birch-Swinnerton-Dyer
conjecture: “This remarkable conjecture relates the behavior
of a function $L$ where it is not known to be defined, to the order of a
group $\text{Ш}$ not known to be finite.”

There has been dramatic progress on this conjecture in recent years. 
Until further notice we will assume that E is defined over Q. If E has 
complex multiplication, we will say that E has CM. 

Coates-Wiles (1977). If $E$ has CM, then $L(E/\mathbf{Q},1) \ne 0$ implies $E(\mathbf{Q})$ is
finite [114].

Gross-Zagier (1986). If $E$ is modular and $L(E/\mathbf{Q},s)$ has a simple zero at
$s = 1$, then $E(\mathbf{Q})$ is infinite [Gr-Za].

Rubin (1987).

(a) If $E$ has CM and $L(E/\mathbf{Q},1) \ne 0$, then $\text{Ш}(E/\mathbf{Q})$ is finite.

(b) If $E$ has CM and $r_E \ge 2$, then $L(E/\mathbf{Q},s)$ has a zero at $s = 1$ of order 2 or
greater [Ru1].

Rubin’s result (a) gave the first known examples of $\text{Ш}$ being finite. For
example, for $y^2 = x^3 - x$, $\text{Ш}$ is trivial; for the curve $y^2 = x^3 + 17x$,
$\text{Ш} = \mathbf{Z}/2\mathbf{Z} \oplus \mathbf{Z}/2\mathbf{Z}$.

Combining the preceding results leads to Theorem 20.5.1. 


Theorem 20.5.1. If $E$ has CM and $\operatorname{ord}_{s=1} L(E/\mathbf{Q},s) = \rho_E \le 1$, then $\rho_E = r_E$.

This rather spectacular result was pushed much further by V.A.
Kolyvagin in 1988. It turns out that the theorem remains true when the
hypothesis that $E$ has CM is replaced by the much weaker hypothesis that
$E$ is modular. According to the Taniyama-Weil conjecture, this covers all
elliptic curves defined over $\mathbf{Q}$. To explain this work, it is necessary to
develop in more detail what it is that Gross and Zagier were able to prove,
and to do this one must define the notion of a Heegner point on the
modular curve $X_0(N)$.

Recall that the points on the modular curve $X_0(N)$ correspond to isomorphism
classes of pairs $(E, C)$, where $C$ is a cyclic subgroup of $E$ of
order $N$. Let $K$ be an imaginary quadratic number field with discriminant
$D < 0$, and assume $(N,D) = 1$. We further assume that every rational
prime $p$ dividing $N$ splits in $O_K$, i.e., $pO_K = \mathfrak{p}\mathfrak{p}'$. From this assumption it
is not too hard to show that there exist ideals $\mathcal{N} \subset O_K$ such that
$O_K/\mathcal{N} \cong \mathbf{Z}/N\mathbf{Z}$. Consider the pair $(\mathbf{C}/O_K, \mathcal{N}^{-1}/O_K)$, where $\mathcal{N}^{-1}$ is the fractional
ideal inverse to $\mathcal{N}$ in $K$. $\mathbf{C}/O_K$ defines an elliptic curve over $\mathbf{C}$, and
$\mathcal{N}^{-1}/O_K \cong O_K/\mathcal{N} \cong \mathbf{Z}/N\mathbf{Z}$ is a cyclic subgroup of order $N$. Thus, we have
defined a point $x_K$ on $X_0(N)(\mathbf{C})$. It is a fact that this point has coordinates
in $H$, the Hilbert class field of $K$. Recall that $H$ is the maximal unramified
extension of $K$ whose Galois group, $\mathrm{Gal}(H/K)$, is abelian. The point $x_K$ is
called a Heegner point in honor of Kurt Heegner, who first defined such
points and investigated their properties.
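
The two arithmetic conditions imposed on $N$ and $D$ above are easy to test by machine. The sketch below is an illustration under stated assumptions: $D$ is taken to be a fundamental discriminant (this is not checked), and the splitting of a prime $p$ not dividing $D$ is decided by the standard criterion that $D$ be a nonzero square modulo $p$ for odd $p$, and $D \equiv 1 \pmod 8$ for $p = 2$. All function names are hypothetical.

```python
from math import gcd

def prime_factors(n):
    """Distinct prime factors of n >= 1, by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def splits(p, d):
    """True if the prime p splits in the quadratic field of discriminant d.

    Assumes d is a fundamental discriminant (not verified here).
    """
    if p == 2:
        return d % 8 == 1                      # 2 splits iff d = 1 (mod 8)
    return pow(d % p, (p - 1) // 2, p) == 1    # Euler's criterion

def heegner_hypothesis(n, d):
    """Check that (N, D) = 1 and that every prime dividing N splits in K."""
    return gcd(n, d) == 1 and all(splits(p, d) for p in prime_factors(n))

# Which of a few small discriminants are admissible for N = 37?
for d in (-3, -4, -7, -8, -11):
    print(d, heegner_hypothesis(37, d))
```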

Now suppose that $\varphi: X_0(N) \to E$ is a modular parameterization of an
elliptic curve $E$ defined over $\mathbf{Q}$. If $x_K$ is a Heegner point, define
$y_K = \sum_{\sigma} \varphi(x_K)^{\sigma}$, where the sum is over all automorphisms $\sigma$ in $\mathrm{Gal}(H/K)$ and the sum
denotes group addition on $E$. Clearly, $y_K \in E(K)$. The first part of the
following result was conjectured earlier, around 1983, by Birch and
Stephens.

Kolyvagin (1988). Assume $y_K$ has infinite order in $E(K)$. Then

(a) The group $E(K)$ has rank 1.

(b) The group $\text{Ш}(E/K)$ is finite.

Of course, we are ultimately interested in $E(\mathbf{Q})$ and $\text{Ш}(E/\mathbf{Q})$. By combining
Kolyvagin’s theorem with the work of Gross-Zagier and some analytic
results (to be discussed later) we can deduce the following theorem.

Theorem 20.5.2. Suppose $E/\mathbf{Q}$ is a modular elliptic curve. Then

(a) If $L(E/\mathbf{Q},s)$ has a simple zero at $s = 1$, then $E(\mathbf{Q})$ has rank 1, and
$\text{Ш}(E/\mathbf{Q})$ is finite.

(b) If $L(E/\mathbf{Q},1) \ne 0$, then $E(\mathbf{Q})$ is finite, and $\text{Ш}(E/\mathbf{Q})$ is finite.

The deduction of this theorem from the theorem of Kolyvagin is quite 
difficult. We will just sketch some of the ideas involved. 

The first step is to connect Heegner points with the theory of $L$-functions.
That such a connection should exist was also conjectured by B.J.
Birch and N.M. Stephens.

Theorem 20.5.3 (Gross and Zagier). Let $E/\mathbf{Q}$ be a modular elliptic curve,
and $\varphi: X_0(N) \to E$ be a modular parameterization. Let $D < 0$ be the
discriminant of an imaginary quadratic number field $K$. Assume $(N, D) = 1$
and that every rational prime $p$ dividing $N$ splits in $K$. Then
$L'(E/K, 1) = C \, \hat{h}(y_K)$, where $C$ is a nonzero constant (which can be explicitly given)
and $\hat{h}$ is the canonical height on $E(K)$.


It follows from this theorem and the properties of the canonical height
discussed in §4 that $y_K$ has infinite order if and only if $L'(E/K,1) \ne 0$,
which brings in the $L$-function. Now we want to relate $L(E/K,s)$ to
$L(E/\mathbf{Q},s)$. To do this it is necessary to define the quadratic twist of an
elliptic curve (see [Sil], Chap. 10, §5).

Suppose $E$ is defined by $y^2 = x^3 + ax + b$. For $D \in \mathbf{Z}$, $D \ne 0$, we define
$E_D$, the quadratic twist of $E$ by $D$, by the equation $Dy^2 = x^3 + ax + b$. $E_D$
is again an elliptic curve over $\mathbf{Q}$, and it is not too hard to prove the
following proposition.


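Proposition 20.5.4. Let $K$ be a quadratic number field with discriminant
$D$, and $E$ an elliptic curve over $\mathbf{Q}$. Then

(a) $\operatorname{rank} E(K) = \operatorname{rank} E(\mathbf{Q}) + \operatorname{rank} E_D(\mathbf{Q})$ and

(b) $L(E/K,s) = L(E/\mathbf{Q},s) \, L(E_D/\mathbf{Q},s)$.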
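
One standard way to see why part (b) is plausible is local: at an odd prime $p$ not dividing $D$, the point counts of $E$ and $E_D$ modulo $p$ are related by $c_p(E_D) = \left(\frac{D}{p}\right) c_p(E)$, where $\left(\frac{D}{p}\right)$ is the Legendre symbol. The sketch below verifies this identity by brute-force counting for one curve and one twist. The coefficients $a = 1$, $b = 1$, $D = -7$ and the function names are illustrative choices, not taken from the text.

```python
def c_p_twist(d, a, b, p):
    """Return p + 1 - N_p for the curve d*y^2 = x^3 + a*x + b over F_p.

    N_p counts the affine solutions plus one point at infinity.
    Assumes p is an odd prime not dividing d.
    """
    n_p = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        n_p += sum(1 for y in range(p) if (d * y * y) % p == rhs)
    return p + 1 - n_p

def legendre(d, p):
    """Legendre symbol (d/p) for an odd prime p, via Euler's criterion."""
    r = pow(d % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

a, b, d = 1, 1, -7  # illustrative choices
for p in (3, 5, 11, 13, 17, 19, 23):
    twisted = c_p_twist(d, a, b, p)
    original = c_p_twist(1, a, b, p)
    assert twisted == legendre(d, p) * original
    print(p, original, twisted)
```

The identity holds because, for fixed $x$, the number of $y$ with $Dy^2 \equiv f(x)$ equals $1 + \left(\frac{D f(x)}{p}\right)$, and the Legendre symbol is multiplicative.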

We are now in a position to sketch the proof of Theorem 20.5.2. Let’s
consider part (a). The assumption is that $L(E/\mathbf{Q}, s)$ has a simple zero at
$s = 1$, i.e., $L(E/\mathbf{Q}, 1) = 0$ and $L'(E/\mathbf{Q}, 1) \ne 0$. From Proposition 20.5.4,
part (b), we find that $L'(E/K, 1) = L'(E/\mathbf{Q}, 1) \, L(E_D/\mathbf{Q}, 1)$. By a theorem of
J.L. Waldspurger, there exist infinitely many fundamental discriminants
$D < 0$ that satisfy the hypotheses of Theorem 20.5.3 and such that
$L(E_D/\mathbf{Q}, 1) \ne 0$. For such a $D$ we must have $L'(E/K, 1) \ne 0$, and so by
Theorems 20.5.3 and 20.4.4, the point $y_K \in E(K)$ has infinite order. By
Kolyvagin’s theorem this implies $\operatorname{rank} E(K) = 1$, and $\text{Ш}(E/K)$ is finite. By
Proposition 20.5.4, part (a), either $E(\mathbf{Q})$ has rank 1, or $E_D(\mathbf{Q})$ has rank 1.
Let bar denote complex conjugation. With the assumptions we have made
it can be shown that $\bar{y}_K = y_K$. Thus, $2y_K = y_K + \bar{y}_K \in E(\mathbf{Q})$, and we
conclude that $E(\mathbf{Q})$ has rank 1, as claimed. The fact that $\text{Ш}(E/\mathbf{Q})$ is finite
follows easily from the fact that $\text{Ш}(E/K)$ is finite (provided that one knows
the definition of either, of course).
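
The derivative formula used in this sketch is just the product rule applied to Proposition 20.5.4, part (b). Written out as a routine step (spelled out here for convenience, assuming the analytic continuation of both factors near $s = 1$):

```latex
\[
\begin{aligned}
L'(E/K,s) &= L'(E/\mathbf{Q},s)\,L(E_D/\mathbf{Q},s) + L(E/\mathbf{Q},s)\,L'(E_D/\mathbf{Q},s),\\
L'(E/K,1) &= L'(E/\mathbf{Q},1)\,L(E_D/\mathbf{Q},1)
  \qquad \text{since } L(E/\mathbf{Q},1) = 0 .
\end{aligned}
\]
```

The same computation with the roles of the two factors exchanged gives $L'(E/K,1) = L(E/\mathbf{Q},1)\,L'(E_D/\mathbf{Q},1)$ whenever $L(E_D/\mathbf{Q},1) = 0$.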

To prove Theorem 20.5.2, part (b), we can use similar reasoning. The 
main difficulty remaining is to show the existence of discriminants D of 
the type we need which satisfy $L'(E_D/\mathbf{Q}, 1) \ne 0$. The existence of infinitely
many such discriminants was shown by D. Bump, S. Friedberg, and J.
Hoffstein in 1989 [Bu-Fr-Hof]. Independently, and at about the same
time, this result was also obtained by M.R. Murty and V.K. Murty
[Mur-Mur].

Surprisingly, these beautiful new results in the arithmetic theory of
elliptic curves have led to the resolution of an old problem of Gauss on the 
class numbers of imaginary quadratic number fields. This is the topic of 
the next, and final, section of this chapter. 


§6 Applications to Gauss’s Class 
Number Conjecture 

A large percentage of Gauss’s number theoretic masterpiece, Disquisitiones
Arithmeticae [136], is taken up with the theory of binary quadratic
forms. In Article 303 of that work he describes the results of extensive
calculations of class numbers of definite quadratic forms. These calculations
can be reinterpreted as calculations of class numbers of imaginary
quadratic number fields. If $D < 0$ is the discriminant of such a field, let
$h(D)$ denote its class number. Gauss observed that apparently $h(D) \to \infty$
as $|D| \to \infty$. In fact, the last $D$ for which $h(D) = 1$ seemed to be $-163$, the
last for which $h(D) = 2$ seemed to be $-427$, and the last for which $h(D) = 3$
seemed to be $-907$ (Gauss uses a somewhat different normalization for
class numbers and so his values are different from these). These observations
led to two problems. First, prove the assertion that $h(D) \to \infty$ as
$|D| \to \infty$. Second, prove an effective version of the same result, namely,
for every integer $n$, produce an integer $C(n)$ such that if $|D| > C(n)$, then
$h(D) > n$. One would hope that the constants $C(n)$ would be small enough
to show that Gauss succeeded in finding all imaginary quadratic number
fields of class number 1, 2, and 3.
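
Class numbers of this kind are easy to compute for small $|D|$. The following sketch uses the classical description of $h(D)$ as the number of reduced primitive binary quadratic forms $ax^2 + bxy + cy^2$ of discriminant $b^2 - 4ac = D$, where reduced means $-a < b \le a \le c$, with $b \ge 0$ when $a = c$. This is one standard method, offered as an illustration rather than as the way Gauss organized his calculations; the function name is chosen here.

```python
from math import gcd

def class_number(d: int) -> int:
    """Count reduced primitive forms (a, b, c) with b^2 - 4ac = d < 0."""
    if d >= 0 or d % 4 not in (0, 1):
        raise ValueError("d must be a negative discriminant")
    h = 0
    a = 1
    # A reduced form has |b| <= a <= c, which forces 3a^2 <= |d|.
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):      # -a < b <= a
            num = b * b - d                 # equals 4ac
            if num % (4 * a):
                continue
            c = num // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:            # boundary case: require b >= 0
                continue
            if gcd(gcd(a, b), c) == 1:      # primitive forms only
                h += 1
        a += 1
    return h

for d in (-163, -427, -907):
    print(d, class_number(d))
```

Running the loop reproduces the class numbers 1, 2, and 3 quoted above for $-163$, $-427$, and $-907$.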

The first problem was solved affirmatively in the 1930s by the combined
efforts of several mathematicians. The story is amusing and is connected
with the Riemann hypothesis, so we pause to recall what that is
about.

Let $\zeta(s) = \sum n^{-s}$ denote the zeta function of Riemann. $\zeta(s)$, as was
proved in Chapter 16, can be analytically continued to the whole complex
plane and is holomorphic everywhere except for a simple pole at $s = 1$.
Riemann conjectured that the only zeros of $\zeta(s)$ in the strip $0 \le \mathrm{Re}(s) \le 1$
are on the line $\mathrm{Re}(s) = 1/2$. This assertion is known as the Riemann
hypothesis and is one of the most famous unsolved conjectures in all of
mathematics. There is a generalization of this assertion known as the
generalized Riemann hypothesis. Dedekind associated a zeta function to
an arbitrary number field $K$ by setting $\zeta_K(s) = \sum NA^{-s}$, where the sum is
over all nonzero integral ideals $A \subseteq O_K$ and $NA = [O_K : A]$. E. Hecke showed that
this function could be analytically continued to all of $\mathbf{C}$ with only one
pole, a simple pole at $s = 1$, that it satisfied a functional equation, etc. The
generalized Riemann hypothesis asserts that the only zeros of $\zeta_K(s)$ in the
strip $0 \le \mathrm{Re}(s) \le 1$ are on the line $\mathrm{Re}(s) = 1/2$. In what follows, we use this
assertion only as it applies to imaginary quadratic number fields.

The first major step forward toward a resolution of Gauss’s conjecture 
was made by Hecke. 

Hecke (1918). Let D < 0 be the discriminant of an imaginary quadratic 
number field K. Assume the generalized Riemann hypothesis. Then, there 
is an absolute constant $C$ such that

$$h(D) > C \sqrt{|D|} / \log |D|.$$

This certainly shows that $h(D) \to \infty$ as $|D| \to \infty$, but it assumes a result
that is far from proven even today. The next developments were really
unexpected.

Deuring (1933). If the Riemann hypothesis is false, then h(D) > 1 if \D\ is 
sufficiently large. 

Shortly thereafter, Mordell strengthened this result as follows: 

Mordell (1934). If the Riemann hypothesis is false, then $h(D) \to \infty$ as
$|D| \to \infty$.

Finally, H. Heilbronn completed this circle of ideas:

Heilbronn (1934). If the generalized Riemann hypothesis is false, then
$h(D) \to \infty$ as $|D| \to \infty$.

Putting it all together gives a proof of the qualitative version of Gauss’s 
conjecture. 

Theorem 20.6.1 (Hecke, Deuring, Mordell, Heilbronn).

$h(D) \to \infty$ as $|D| \to \infty$.

The method of proof here is truly amazing. If the generalized Riemann 
hypothesis is true, then the theorem is true. If the generalized Riemann 
hypothesis is false, then the theorem is true. Thus, the theorem is true!! 

C.L. Siegel took this approach one step further and proved the definitive
theorem along these lines. His proof makes no use of the Riemann
hypothesis one way or another.

Siegel (1935). Given $\varepsilon > 0$, there is a constant $C(\varepsilon) > 0$ such that

$$h(D) > C(\varepsilon) \, |D|^{1/2 - \varepsilon}.$$

This is certainly a wonderful result, but it does not solve the problem of
finding an effective version of Gauss’s conjecture, because there is no
way to compute the constant $C(\varepsilon)$ whose existence is asserted.

The next important step was taken almost 20 years later by Kurt
Heegner, who, in 1952, published a paper entitled “Diophantische Analysis
und Modulfunktionen” (Diophantine Analysis and Modular Functions).
In this paper Heegner claims to have solved Gauss’s class number
1 conjecture by introducing new methods from the theory of modular
functions. That is, he claims to have shown that the only negative discriminants
$D$ with $h(D) = 1$ are $-3$, $-4$, $-7$, $-8$, $-11$, $-19$, $-43$, $-67$, and
$-163$. Although his paper was published in a reputable journal, the
Mathematische Zeitschrift, his claim was generally discounted. The paper
was quite obscure in places, and it did contain some mistakes. As it
turned out, this neglect was completely unwarranted. His claim was later
vindicated. Unfortunately, he died before his accomplishment was generally
recognized.

The first accepted proof of the class number 1 conjecture was given by
H. Stark in 1967. Soon thereafter A. Baker found another proof based on
the theory of transcendental numbers. The matter now being firmly established,
people went back to look at Heegner’s work and discovered that
the “gap” in his proof was not too hard to fill. Papers by Deuring, Siegel,
and Stark, among others, appeared showing how this could be done.

In 1971, Baker and Stark independently resolved the class number 2
problem. The largest (in absolute value) negative discriminant with class
number 2 was $-427$, as predicted by Gauss. However, there seemed to be
little hope that their methods could be extended to cover the case $h = 3$,
not to speak of larger class numbers.

This subject is full of surprises, and in 1976 D. Goldfeld proved a result 
which connected the conjecture of Birch and Swinnerton-Dyer with the 
conjecture of Gauss, although on the face of it, these conjectures are 
completely unrelated. 

Goldfeld (1976). Suppose there exists an elliptic curve $E/\mathbf{Q}$ whose
$L$-function, $L(E/\mathbf{Q},s)$, can be analytically continued to all of $\mathbf{C}$ and which
satisfies a functional equation of the predicted type (see §5 of this chapter)
and has a zero of order 3 or greater at $s = 1$. Then, given $\varepsilon > 0$, there is an
effectively computable constant $C(\varepsilon)$ such that $h(D) > C(\varepsilon)(\log |D|)^{1-\varepsilon}$
[Go2], [Go3].

If the sign of the functional equation for $L(E/\mathbf{Q},s)$ is $-1$, it follows that
$L(E/\mathbf{Q},s)$ has a zero of odd order at $s = 1$. Thus to ensure a zero of order 3
or greater in such a case, it is only necessary to prove that $L'(E/\mathbf{Q},1) = 0$.
If $E$ is a modular elliptic curve, then its $L$-function has the required analytic
continuation and functional equation. Moreover, the work of Gross
and Zagier discussed in §5 related the derivative at $s = 1$ to the height of a
Heegner point. Exploiting these connections, Gross and Zagier were able
to prove the following theorem.


Theorem (Gross-Zagier [1986]). The curve $-139y^2 = x^3 + 10x^2 - 20x + 8$
satisfies all the hypotheses of Goldfeld’s theorem. In particular, it has a
zero of order exactly 3 at $s = 1$ [Gr-Za].

Taken together, these results of Goldfeld and Gross-Zagier finally give
a positive resolution of the effective version of Gauss’s class number
conjecture some two hundred years after it was made.

The curve in the preceding theorem has conductor 714,877. The constant
$C(\varepsilon)$ in Goldfeld’s theorem is dependent on the size of the conductor,
and this conductor is too large to resolve the case of class number 3.
Brumer and Kramer had shown that the curve $y^2 + y = x^3 - 7x + 6$ of
conductor 5077 has rank 3. If one could prove that it was modular, its
$L$-function would have the required analytic properties and Birch-Swinnerton-Dyer
would predict a zero of order 3 at $s = 1$. Assuming it to be
modular, Buhler, Gross, and Zagier [Bu-Gr-Zag] proved its $L$-function
had a zero of order 3 at $s = 1$. Then, Mestre and Serre verified that it was a
modular elliptic curve. Working with this curve J. Oesterlé [Oes2] was
able to prove that $h(D) > \frac{1}{55} \log |D|$ if $D$ is prime. Together with earlier
work of Montgomery and Weinberger, this was enough to show that $-907$
was the largest (in absolute value) negative discriminant of class number
3. Once again, Gauss was right!
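
To see what the bound with constant $1/55$ gives for class number 3, one can simply solve the inequality for $|D|$. The following is a back-of-the-envelope sketch, restricted to prime discriminants as in the stated bound:

```latex
\[
h(D) = 3
\;\Longrightarrow\;
3 > \tfrac{1}{55}\log|D|
\;\Longrightarrow\;
\log|D| < 165
\;\Longrightarrow\;
|D| < e^{165} < 10^{72}.
\]
```

So the bound leaves a finite, though very large, range of discriminants to examine, which is where the earlier work cited above enters.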

It is perhaps fitting to end with an open problem. Throughout this 
section we have been discussing imaginary quadratic number fields. If one 
considers real quadratic number fields, the situation is much more mysterious.
Gauss had already noticed that many real quadratic number fields
have class number 1. Considering such fields which have prime discriminant,
computations show that about 80% of them have class number 1. It
is an open problem to prove that infinitely many real quadratic number 
fields have class number 1. In fact, it is not even known if there are 
infinitely many number fields with class number 1. In spite of all the 
successes recorded in this chapter, much remains to be done. 

Notes 

In this section, numbered references refer to items in the general bibliography
at the end of the book. New references relevant to the subject
matter of this chapter are cited here by acronyms.

A major portion of this chapter consists of an expanded version of the 
expository article by M. Rosen [Ro]. For an elementary introduction to 
the algebraic theory of curves, the book by W. Fulton [135] is recommended.
At present, the standard introduction to algebraic geometry is R.
Hartshorne’s book [144]. A somewhat less demanding, and very readable 
text, is the book by I. R. Shafarevich [Shaf]. 

B. Mazur has provided an excellent introduction to the ideas surrounding
Faltings’s resolution of the Mordell conjecture [Maz]. A very good
expository article on the proof itself appears in S. Bloch [B]. Two recent
volumes are devoted to providing the (extensive) mathematical background
necessary to understanding the proof: [Co-Sil] and [Fa-Wu]. The
first, [Co-Sil], contains an English translation of Faltings’s original paper
as well as a short historical article by Faltings on how he was led to the
proof.

For conjectures of Mordell type in higher dimensions, the reader
should consult the article by P. Vojta, “A Higher Dimensional Mordell
Conjecture,” in [Co-Sil]. In a somewhat different direction, an approach
requiring an extensive amount of differential geometry can be found in an
article by S. Lang [La1].

For an influential survey article on the arithmetic of elliptic curves, 
discussed in §2, see J. Tate [Ta]. There are now several texts devoted to 
this topic. Probably the best general introduction is by J. Silverman [Sil]. 
Other books, which overlap with the material in [Sil] but contain valuable 
discussions of other topics, are by D. Husemoller [Hu], N. Koblitz [Ko],
and S. Lang [La2], [La3].

For an elegant introduction to the subject of modular forms, the reader 
should consult the last chapter of J.-P. Serre [Se]. More extensive introductions
are given by T. Apostol [Ap], S. Lang [La5], and G. Shimura
[Shim]. These are listed in increasing order of sophistication. Shimura’s
book contains a careful construction of the curves $X_0(N)$ and $X_1(N)$. The
book by Lang has an introduction to the connection between modular
forms and Galois representations, a theory used by Serre and Ribet in the
proof of the theorem connecting the Taniyama-Weil conjecture and Fermat’s
last theorem. The book by Koblitz [Ko] is also recommended. In
addition to containing an introduction to modular forms, this book 
presents the proof of a beautiful theorem of J. Tunnell, which virtually 
solves an old problem about congruent numbers (integers equal to the 
area of a right triangle with rational sides) by relating the problem to the 
conjecture of Birch and Swinnerton-Dyer. 

A. Ogg [Ogg] provides an introduction to the theory of modular forms 
which includes an exposition of the famous 1967 paper of Weil [We]. J.
Oesterlé discusses the theorem linking the Taniyama-Weil conjecture
with Fermat’s last theorem, and much else besides [Oes1].

For introductions to the theory of heights on elliptic curves, the reader 
can consult the books by Silverman [Sil] and Husemoller [Hu]. For an 
introduction to the theory in a more general context, see Silverman’s
article “The Theory of Height Functions,” which appears as Chapter VI
in [Co-Sil]. For the theory of heights, as well as many other things of
interest in arithmetic geometry, the reader should consult S. Lang [La4].
This book appeared just before Faltings’s proof of the Mordell conjecture
and represented the state of the art in the subject “before the revolution.”

For the theorem on lower bounds for the canonical height, and the 
subsequent application to bounding the number of integral points, see J. 
Silverman [Sil2], [Sil3]. 

For a more detailed series of conjectures about the canonical heights of 
the elements of a basis for E(K), see Lang [La6]. 

The writing of §5 and the next section was heavily influenced by the 
survey article by D. Zagier [Zag]. It is amazing how much information this 
article condenses into just four pages.

The Coates-Wiles theorem appears in [114]. The basic new results
which we discuss in this section appear in [Gr-Za], [Kol], and [Ru1].

A survey of the work of Gross-Zagier is given by J. Coates [Co]. 

For a somewhat simplified exposition of a portion of the theorem of 
Kolyvagin we have discussed, see K. Rubin [Ru2]. 

It is difficult to locate a reference to the analytic result of Waldspurger 
mentioned in the text. D. Bump, S. Friedberg, and J. Hoffstein
[Bu-Fr-Hof] discuss both his work and their new results on derivatives of
$L$-functions. The paper containing the proof of their main theorem has not
yet appeared. The same is true of the proof of M.R. Murty and V.K. 
Murty [Mur-Mur]. 

Section 6 follows rather closely the exposition given by D. Goldfeld 
[Go1]. We refer the reader there for an extensive bibliography of articles
on this subject. The theorem of Goldfeld which we discussed is contained 
in two papers [Go2], [Go3]. 

A simplification of Goldfeld’s proof, an exposition of the class number 
problem, and a detailed discussion of the application of the theorem of 
Gross-Zagier to the problem are provided by J. Oesterlé [Oes2]. The
reader should also consult the introduction to [Gr-Za] as well as the
expository paper of Zagier [Zag] mentioned previously. The proof that the
$L$-function of $y^2 + y = x^3 - 7x + 6$ has a zero of order 3 at $s = 1$, subject to
the assumption that it is modular, is given by J. Buhler, B. Gross, and D.
Zagier [Bu-Gr-Zag]. 

The paper of Montgomery and Weinberger which was mentioned in 
connection with the class number 3 problem is “Notes on Small Class 
Numbers” [Mo-We]. A very interesting paper by Buell resulted from the 
calculation of all class numbers of imaginary quadratic number fields with 
discriminant of absolute value less than 4 million [Bue]. Up to 4 million,
$h(D) = 1$ for 9 values of $|D|$, the smallest being 3 and the largest 163. In the
same range there are 18 values of $|D|$ such that $h(D) = 2$, the smallest
being 15 and the largest being 427. There are 16 values of $|D|$ such that
$h(D) = 3$, the smallest being 23 and the largest 907. We now know that
these lists contain all discriminants with class numbers 1, 2, or 3. Buell
presents similar statistics for many other values of $h(D)$. As this book
goes to press, there is a rumor that the class number 4 problem has been 
solved. 


Bibliography 

Ap. T. Apostol. Modular Functions and Dirichlet Series in Number Theory. New 
York: Springer-Verlag, 1976. 

B. S. Bloch. The proof of the Mordell conjecture. Math. Intelligencer, 6(2), 
(1984), 41-47. 

Bue. D.A. Buell. Small class numbers and extreme values of L-functions. Math. 
Comp., 31 (1977), 786-796. 





Bu-Fr-Hof. D. Bump, S. Friedberg, and J. Hoffstein. A non-vanishing theorem
for derivatives of automorphic L-functions with applications to elliptic curves.
Bull. Am. Math. Soc., 21(1), (1989), 89-93.

Bu-Gr-Zag. J. Buhler, B. Gross, and D. Zagier. On the conjecture of Birch and
Swinnerton-Dyer for an elliptic curve of rank 3. Math. Comp., 44(170), (1985),
471-481.

Co. J. Coates. The Work of Gross-Zagier on Heegner Points and the Derivatives
of L-series, Sém. Bourbaki, No. 635, 1984-85. In Astérisque, Vol. 133-34,
1986.

Co-Sil. G. Cornell and J. Silverman. Arithmetic Geometry. New York: Springer- 
Verlag, 1986. 

Fa. G. Faltings. Diophantine approximation on abelian varieties, to appear. 

Fa-Wu. G. Faltings, G. Wüstholz, et al. Rational Points. Vieweg, Aspects of
Mathematics, Vol. E6, 1984.

Go1. D. Goldfeld. Gauss’s class number problem for imaginary quadratic fields.
Bull. Am. Math. Soc., 13(1), (1985), 23-37.

Go2. D. Goldfeld. The class number of quadratic fields and the conjectures of
Birch and Swinnerton-Dyer. Ann. Scuola Norm. Sup. Pisa, 3(4), (1976), 623-663.

Go3. D. Goldfeld. The conjectures of Birch and Swinnerton-Dyer and the class
number of quadratic fields. Arith. de Caen, Astérisque, 41-42 (1977), 219-227.

Gr-Za. B. Gross and D. Zagier. Heegner points and derivatives of L-series.
Invent. Math., 84 (1986), 225-320.

Hu. D. Husemoller. Elliptic Curves. New York: Springer-Verlag, 1987. Graduate 
Texts in Mathematics, Vol. 111. 

Ko. N. Koblitz. Introduction to Elliptic Curves and Modular Forms. New York:
Springer-Verlag, 1984. Graduate Texts in Mathematics, Vol. 97.

Kol. V.A. Kolyvagin. Finiteness of E(Q) and Ш(E,Q) for a class of Weil curves.
Math. Nauk. SSSR Ser. Mat., 52 (1988), 522-540 (Russian); Math. of the
USSR Izvestiya, 32 (1989), 523-542 (English).

La1. S. Lang. Hyperbolic and Diophantine analysis. Bull. Am. Math. Soc., 14(2),
(1986), 159-205.

La2. S. Lang. Elliptic Functions. Reading, Mass.: Addison-Wesley, 1973. 

La3. S. Lang. Elliptic Curves: Diophantine Analysis. New York:
Springer-Verlag, 1978.

La4. S. Lang. Fundamentals of Diophantine Geometry. New York:
Springer-Verlag, 1983.

La5. S. Lang. Introduction to Modular Forms. New York: Springer-Verlag, 1976.

La6. S. Lang. Conjectured Diophantine estimates on elliptic curves. Progress in
Mathematics, Vol. 35. Cambridge, Mass.: Birkhäuser, 1983.

Maz. B. Mazur. Arithmetic on curves. Bull. Am. Math. Soc., 14(2), (1986),
207-259.

Mo-We. H.L. Montgomery and P.J. Weinberger. Notes on small class numbers.
Acta Arith., 24 (1973), 529-542.

Mur-Mur. M.R. Murty and V.K. Murty. Mean values of derivatives of modular 
L-series. To appear in Ann. of Math. 

Oes1. J. Oesterlé. Nouvelles approches du “théorème” de Fermat, Sém.
Bourbaki, No. 694, 1987. In Astérisque, Vol. 161-62, 1988.

Oes2. J. Oesterlé. Nombres de classes des corps quadratiques imaginaires, Sém.
Bourbaki, No. 631, 1983-84. In Astérisque, Vol. 121-22, 1985.

Ogg. A. Ogg. Modular Forms and Dirichlet Series. Menlo Park, Calif.: W. A. 
Benjamin, 1969. 

Ro. M. Rosen. New results on the arithmetic of elliptic curves. Les Gazette des
Sciences Math. du Québec, forthcoming.





Ru1. K. Rubin. Tate-Shafarevich groups and L-functions of elliptic curves with
complex multiplication. Invent. Math., 89 (1987), 527-560.

Ru2. K. Rubin. The work of Kolyvagin on the arithmetic of elliptic curves. In: 
The Arithmetic of Complex Manifolds. Lecture Notes in Mathematics, Vol. 
1399. New York: Springer-Verlag, 1989. 

Se. J.-P. Serre. A Course in Arithmetic. New York: Springer-Verlag, 1973. 

Shaf. I.R. Shafarevich. Basic Algebraic Geometry. New York: Springer-Verlag,
1977.

Shim. G. Shimura. Arithmetic Theory of Automorphic Forms. Tokyo: Iwanami 
Shoten and Princeton, N.J.: Princeton University Press, 1971. 

Sil. J. Silverman. The Arithmetic of Elliptic Curves. New York: Springer-Verlag, 
1986. Graduate Texts in Mathematics, Vol. 106. 

Sil2. J. Silverman. Lower bounds for the canonical height on elliptic curves. Duke
Math. J., 48 (1981), 633-648.

Sil3. J. Silverman. A quantitative version of Siegel’s theorem. J. Reine und
Angew. Math., 378 (1987), 60-100.

Ta. J. Tate. The arithmetic of elliptic curves. Invent. Math., 23 (1974), 179-206.

We. A. Weil. Über die Bestimmung Dirichletscher Reihen durch
Funktionalgleichungen. Math. Ann., 168 (1967), 149-156.

Zag. D. Zagier. L-Series of elliptic curves, the Birch-Swinnerton-Dyer conjecture,
and the class number problem of Gauss. Notices Am. Math. Soc., 31(7),
(1984), 739-743.




Selected Hints for the Exercises 


Chapter 1 

6. Use Exercise 4. 

8. Do it for the case d = 1 and then use Exercise 7 to do it in general.

9. Use Exercise 4. 

15. Here is a generalization: a is an nth power iff $n \mid \operatorname{ord}_p a$ for all primes p.

16. Use Exercise 15. 

17. Use Exercise 15 to show that $a^2 = 2b^2$ implies that 2 is the square of
an integer.

23. Begin by writing $4(a/2)^2 = (c - b)(c + b)$.

28. Show that $n^5 - n$ is divisible by 2, 3, and 5. Then use Exercise 9.

30. Let s be the largest integer such that $2^s \le n$, and consider
$\sum_{k=1}^{n} 2^{s-1}/k$. Show that this sum can be written in the form
$a/b + 1/2$ with b odd. Then use Exercise 29.

31. $2 = (1 + i)(1 - i) = -i(1 + i)^2$.

34. Since $\omega^2 = -1 - \omega$ we have $(1 - \omega)^2 = 1 - 2\omega + \omega^2 = -3\omega$, so
$3 = -\omega^2(1 - \omega)^2$.
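
The decomposition in the hint to Exercise 30 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name `scaled_harmonic_split` is an illustrative assumption. It uses exact rational arithmetic to confirm that, after removing the term 1/2 contributed by $k = 2^s$, the remaining sum has an odd denominator.

```python
from fractions import Fraction

def scaled_harmonic_split(n):
    """Return (s, rest) with 2**s <= n < 2**(s + 1) and
    sum_{k=1}^{n} 2**(s-1)/k == rest + 1/2 (requires n >= 2)."""
    s = n.bit_length() - 1  # largest s with 2**s <= n
    total = sum(Fraction(2 ** (s - 1), k) for k in range(1, n + 1))
    return s, total - Fraction(1, 2)

# The hint claims that rest = a/b in lowest terms has b odd.
for n in range(2, 60):
    s, rest = scaled_harmonic_split(n)
    assert rest.denominator % 2 == 1
```

Since a/b + 1/2 with b odd is never an integer, the harmonic sum itself cannot be an integer either.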

Chapter 2 

1. Imitate the classical proof of Euclid. 

2. Use $\operatorname{ord}_p(a + b) \ge \min(\operatorname{ord}_p a, \operatorname{ord}_p b)$.

3. If $p_1, p_2, \ldots, p_t$ were all the primes, then $\phi(p_1 p_2 \cdots p_t) = 1$. Now use the
formula for $\phi$ and derive a contradiction.

5. Consider $2^2 + 1$, $2^4 + 1$, $2^8 + 1$, .... No prime that divides one of
these numbers can divide any other, by the previous exercise.

6. Count! Consider the set of pairs (s, t) with $p^s t \le n$.

12. In each case the summand is multiplicative. Hence evaluate first at
prime powers and then use multiplicativity.


17. Use the formula for $\sigma(n)$.

20. If $d \mid n$, then n/d also divides n.

22. If (t, n) = 1, then (n - t, n) = 1, so you can pair those numbers relatively
prime to n in such a way that the sum of each pair is n.
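
The pairing in the hint to Exercise 22 can be illustrated with a short Python sketch (the helper names are assumptions made for this example only). The pairing t -> n - t implies that, for n > 1, the integers in 1..n coprime to n sum to $n\phi(n)/2$.

```python
from math import gcd

def coprime_residues(n):
    """List the integers t in 1..n with gcd(t, n) == 1."""
    return [t for t in range(1, n + 1) if gcd(t, n) == 1]

def pairing_is_closed(n):
    """Check that t -> n - t maps the residues coprime to n to themselves."""
    units = set(coprime_residues(n))
    return all((n - t) in units for t in units)

for n in range(2, 200):
    units = coprime_residues(n)
    assert pairing_is_closed(n)
    # Each pair {t, n - t} sums to n, so twice the total is n * phi(n).
    assert 2 * sum(units) == n * len(units)
```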

Chapter 3 

1. Suppose that $p_1, p_2, \ldots, p_t$ are all congruent to -1 modulo 6. Consider
$N = 6p_1 p_2 \cdots p_t - 1$.

3. $10^k$ is congruent to 1 modulo 3 and 9 and congruent to $(-1)^k$ modulo 11.

5. If a solution exists, then $x^3 \equiv 2 \ (7)$ has a solution. Show that it does not.

10. If n is not a prime power, write n = ab with (a, b) = 1. If $n = p^s$ with
s > 2, then (n - 1)! is divisible by $p \cdot p^{s-1} = p^s = n$. If $n = p^2$ and
$p \ne 2$, then (n - 1)! is divisible by $p \cdot 2p = 2n$.

13. Show that $n^p \equiv n \ (p)$ for all n by induction. If (n, p) = 1, then one can
cancel n and get Fermat’s formula.

17. Let $x_i$ be a solution to $f(x) \equiv 0 \ (p_i^{a_i})$ and solve the system
$x \equiv x_i \ (p_i^{a_i})$.

23. Since $i \equiv -1 \ (1 + i)$, we have $a + bi \equiv a - b \ (1 + i)$. Write a - b =
2c + d, where d = 0 or 1. Then $a + ib \equiv d \ (1 + i)$.

25. Write $\alpha = 1 + \beta\lambda$, cube both sides, and take congruences modulo $\lambda^4$ to
get $\alpha^3 \equiv 1 + (\beta^3 - \omega^2\beta)\lambda^3 \ (\lambda^4)$. Then show that the term in parentheses
is divisible by $\lambda$.
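
The case analysis in the hint to Exercise 10 can be sanity-checked by brute force. This is only an illustrative Python sketch, assuming the exercise concerns (n - 1)! modulo a composite n; the function names are hypothetical.

```python
from math import factorial

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def factorial_residue(n):
    """Return (n - 1)! reduced modulo n."""
    return factorial(n - 1) % n

# For composite n other than 4, both factors named in the hint occur
# among 1, ..., n - 1, so (n - 1)! vanishes modulo n.
for n in range(6, 300):
    if not is_prime(n):
        assert factorial_residue(n) == 0
```

The one exception is n = 4, where $p = 2$ and the factor 2p = 4 is not among 1, 2, 3.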

Chapter 4 

4. If $(-a)^n \equiv 1 \ (p)$ and n is even, then $p - 1 \mid n$. If n is odd, then
$p - 1 \mid 2n$, which implies that $2 \mid n$, a contradiction.

6. This is a bit tricky. If 3 is not a primitive element, show that 3 is
congruent to a square. Use Exercise 4 to show there is an integer a such that
$-3 \equiv a^2 \ (p)$. Now solve $2u \equiv -1 + a \ (p)$ and show that u has order 3.
This would imply that $p \equiv 1 \ (3)$, which cannot be true.

7. Use the fact that 2 is not a square modulo p.

9. See Exercise 22 of Chapter 2 and use the fact that $g^{(p-1)/2} \equiv -1 \ (p)$
for a primitive root g.

11. Express the numbers between 1 and p — 1 as the powers of a primitive 
root and use the formula for the sum of a geometric progression. 

14. If $(ab)^s = e$, then $a^{ns} = e$, implying that $m \mid ns$. Thus $m \mid s$. Similarly,
$n \mid s$. Thus $mn \mid s$.

18. Choose a primitive element (e.g., 2) and construct the elements of order 7.

22. Show first that $1 + a + a^2 \equiv 0 \ (p)$.

23. Use Proposition 4.2.1. 
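
The conclusion of the hint to Exercise 6 can be tested on the known Fermat primes. The sketch below is illustrative Python, assuming (as the hint suggests) that the exercise concerns primes of the form $p = 2^n + 1$ with p > 3; the function name is hypothetical. Because p - 1 is a power of 2, an element is a primitive root exactly when its (p - 1)/2-th power is -1.

```python
def is_primitive_root_fermat(g, p):
    """Primitive-root test valid only when p is prime and p - 1 is a power of 2."""
    assert (p - 1) & (p - 2) == 0, "p - 1 must be a power of 2"
    # The order of g divides p - 1 = 2**n; it is the full 2**n
    # exactly when g**((p - 1) / 2) is -1 modulo p.
    return pow(g, (p - 1) // 2, p) == p - 1

for p in (5, 17, 257, 65537):
    assert is_primitive_root_fermat(3, p)
```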

Chapter 5 

3. Use the identity $4a(ax^2 + bx + c) = (2ax + b)^2 - (b^2 - 4ac)$.

9. Using $k \equiv -(p - k) \ (p)$, show first that
$2 \cdot 4 \cdots (p - 1) \equiv (-1)^{(p-1)/2} \, 1 \cdot 3 \cdot 5 \cdots (p - 2) \ (p)$.

10. Use Exercise 9.

13. If $x^4 - x^2 + 1 \equiv 0 \ (p)$, then $(2x^2 - 1)^2 \equiv -3 \ (p)$ and
$(x^2 - 1)^2 \equiv -x^2 \ (p)$. Conclude that $p \equiv 1 \ (3)$ and $p \equiv 1 \ (4)$ by using
quadratic reciprocity.

18. Let $D = p_1 p_2 \cdots p_m$ and suppose that n is a nonresidue modulo $p_1$.
Find a number b such that $b \equiv 1 \ (p_i)$ for $1 < i \le m$ and $b \equiv n \ (p_1)$.
Then use the definition of the Jacobi symbol to show that (b/D) = -1.

23. Since $s^2 + 1 = (s + i)(s - i)$, if p is prime in Z[i], then either $p \mid s + i$
or $p \mid s - i$, but neither alternative is true.

26. To prove (b) notice that a + b is odd, so from $2p = (a + b)^2 + (a - b)^2$
we see that (2p/(a + b)) = 1. Now use the properties of the Jacobi symbol.

29. It is useful to consider the cases $p \equiv 1 \ (4)$ and $p \equiv 3 \ (4)$ separately.

30. To evaluate the sum notice that $(n(n + 1)/p) = (((2n + 1)^2 - 1)/p)$.
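
The congruence in the hint to Exercise 9 is easy to confirm for small odd primes. The following Python sketch is illustrative only; the function name `even_odd_products` is an assumption.

```python
def even_odd_products(p):
    """Return (2*4*...*(p-1) mod p, 1*3*...*(p-2) mod p) for an odd prime p."""
    even = 1
    for k in range(2, p, 2):       # 2, 4, ..., p - 1
        even = even * k % p
    odd = 1
    for k in range(1, p - 1, 2):   # 1, 3, ..., p - 2
        odd = odd * k % p
    return even, odd

for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31):
    even, odd = even_odd_products(p)
    sign = (-1) ** ((p - 1) // 2)
    # Replacing each even k by -(p - k) introduces (p - 1)/2 sign changes.
    assert even == (sign * odd) % p
```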

Chapter 6 

1. Find an equation of degree 4. 

2. If $a_0\alpha^s + a_1\alpha^{s-1} + \cdots + a_s = 0$, with $a_i \in Z$, multiply both sides by
$a_0^{s-1}$ and conclude that $a_0\alpha$ is an algebraic integer.

3. Suppose that $\alpha$ and $\beta$ satisfy monic equations with integer coefficients of
degree m and n, respectively. Let $\gamma$ be a root of $x^2 + \alpha x + \beta$ and show
that the Z-module generated by $\alpha^i\beta^j\gamma^k$, where $0 \le i < m$, $0 \le j < n$, and
k = 0 or 1, is mapped into itself by $\gamma$.

10. Use $g_a = (a/p)g$ and the fact that $\sum_a (a/p) = 0$.

11. Remember that 1 + (t/p) is the number of solutions to $x^2 \equiv t \ (p)$ and
that $\sum_t \zeta^t = 0$.

13. Use Exercise 12.

16. Show that otherwise $f'(\alpha) = 0$ and apply Proposition 6.1.7.

23. Use Exercise 4 to show that it is enough to show that f(x) is irreducible
in Z[x]. Then write f(x) = g(x)h(x), reduce modulo p, and use the fact
that $F_p[x]$ is a unique factorization domain.

Chapter 7 

3. Since $q \equiv 1 \ (n)$, there are n solutions to $x^n = 1$. If $\beta^n = \alpha$, then the other
solutions to $x^n = \alpha$ are given by $\gamma\beta$, where $\gamma$ runs through the solutions
of $x^n = 1$.

5. $q^n - 1 = (q - 1)(q^{n-1} + \cdots + q + 1)$. Since $q \equiv 1 \ (n)$, we have
$q^{n-1} + \cdots + q + 1 \equiv n \equiv 0 \ (n)$. Thus n(q - 1) divides $q^n - 1$.

7. Let m = [K : F]. $\alpha$ is a square in K iff $\alpha^{(q^m-1)/2} = 1$. If $\alpha$ is not a square in
F, then $\alpha^{(q-1)/2} = -1$. Show that $\alpha^{(q^m-1)/2} = (-1)^m$. This formula yields
the result.

9. Use the method of Exercise 7.

14. One can prove this by exactly the same method as for $F_p$. Alternatively,
suppose that $q = p^m$. Let $f(x) \in F_p[x]$ be an irreducible of degree mn
and let g(x) be an irreducible factor of f(x) in $F_q[x]$. Let $\alpha$ be a root of
g(x) and show that $F_q \subset F_p(\alpha)$. Conclude that $F_q(\alpha) = F_p(\alpha)$ and that
$[F_q(\alpha) : F_q] = n$. It follows that g(x) has degree n.

15. If $x^n - 1$ splits into linear factors in E, where [E : F] = f, then E has $q^f$
elements and $n \mid q^f - 1$ since the roots of $x^n - 1$ form a subgroup of $E^*$
of order n.

23. If $\beta$ is a root of $x^p - x - \alpha$, then so are $\beta + 1, \beta + 2, \ldots, \beta + (p - 1)$.
Using this, one can show the statement about irreducibility. To prove
the final assertion, notice that $\beta^p = \beta + \alpha$ implies that
$\beta^{p^2} = \beta^p + \alpha^p = \beta + \alpha + \alpha^p$, etc. Thus $\beta^{p^n} = \beta + \operatorname{tr}(\alpha)$ and so
$\beta \in F$ iff $\operatorname{tr}(\alpha) = 0$.
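
The divisibility statement in the hint to Exercise 5 holds for any integer q > 1 with $q \equiv 1 \ (n)$, which makes it easy to spot-check. A minimal Python sketch follows; the function name is an illustrative assumption.

```python
def hint5_divisibility(q, n):
    """Check that n*(q - 1) divides q**n - 1; expected whenever q = 1 (mod n)."""
    return (q ** n - 1) % (n * (q - 1)) == 0

for n in range(2, 12):
    for q in range(n + 1, 200, n):   # q = n + 1, 2n + 1, ... are all 1 mod n
        assert hint5_divisibility(q, n)
```

For example, with q = 7 and n = 3 one gets $7^3 - 1 = 342 = 19 \cdot 18$.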

Chapter 8 

1. Use the Corollary to Proposition 8.1.3 and Proposition 8.1.4. 

4. Make the substitution t = (k/2)(u + 1) and use Exercise 3. 

6. It follows from Exercise 5 together with part (d) of Theorem 1, or 
directly from Exercise 4 by substituting k = 1. 

8. Use Proposition 8.1.5 and imitate the proof of Exercise 3. 

14. Use Proposition 8.3.3. 

19. First show that the number of solutions is given by $p^{r-1} + J_0(\chi, \chi, \ldots, \chi)$,
where $\chi$ is a character of order 2 and there are r components in $J_0$.
Then use Proposition 8.5.1 and Theorem 3. Notice in particular that if
r is odd, the answer is simply $p^{r-1}$.

28. For (a): Write

$$\sum_{x=1}^{p-1} x\chi(x) = \sum_{x=1}^{(p-1)/2} x\chi(x) + \sum_{x=1}^{(p-1)/2} (p - x)\chi(p - x).$$

For (b): Write

$$\sum_{x=1}^{p-1} x\chi(x) = \sum_{x=1}^{(p-1)/2} 2x\chi(2x) + \sum_{x=1}^{(p-1)/2} (p - 2x)\chi(p - 2x).$$

For (c) and (d): Equate (a) and (b).

Chapter 9 

3. Use the fact that $N\gamma = a^2 - ab + b^2 \equiv 3(m + n) + 1 \ (9)$.

4. Rewrite $\gamma$ as $3(m + n) - 1 - 3n\lambda$. Thus $\gamma \equiv 3(m + n) - 1 \ (3\lambda)$.

5. Remember that $3 = -\omega^2\lambda^2$.

7. $2 + 3\omega$, $-7 - 3\omega$, and $-4 - 3\omega$.

10. D/5D has 25 elements. Thus $x^{24} - 1$ factors completely into linear
factors in D/5D.

13. Use Exercise 9 to show that the elements listed represent all the cubes 
in D/5D. 

15. Remember that every element in D/nD is represented by a rational 
integer. 

19. Use Exercise 18, the law of cubic reciprocity, and induction on the 
number of primary primes dividing y. 

23. Let $p = \pi\bar{\pi}$, where $\pi$ is primary. By Exercise 15, $x^3 \equiv 3 \ (p)$ is solvable
iff $\chi_\pi(3) = 1$. By Exercise 5, $\chi_\pi(3) = \omega^{2n}$, where $\pi = a + b\omega$ and b = 3n.
It follows that $x^3 \equiv 3 \ (p)$ is solvable iff $9 \mid b$.
24. (c) Use cubic reciprocity with $\pi \equiv b\omega \ (a)$.

(d) Write $(a + b) = (a + b)\omega \cdot \omega^{-1}$ and note that
$(a + b)\omega \equiv -a(1 - \omega) \ (\pi)$.

25. (a) Use Exercise 18 and the corollary to Proposition 9.3.4 to show that
$\chi_{a+b}(b) = 1$. Note that $\pi \equiv -b(1 - \omega) \ (a + b)$.

(b) $\chi_{a+b}(1 - \omega) = (\chi_{a+b}((1 - \omega)^2))^2 = (\chi_{a+b}(-3\omega))^2$, etc.

39. Combine Exercises 6 and 27 of Chapter 8 with Proposition 9.6.1. 

40. See the hint to the previous exercise. 

43. Use Exercise 23, Chapter 6. 
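
The criterion in the hint to Exercise 23 (that $x^3 \equiv 3 \ (p)$ is solvable iff $9 \mid b$, where $\pi = a + b\omega$ is a primary factor of p) can be compared against a brute-force search. The Python sketch below is illustrative; `primary_b` and `three_is_cube` are hypothetical helper names, and "primary" is taken to mean $a \equiv 2 \ (3)$ and $b \equiv 0 \ (3)$.

```python
def primary_b(p):
    """Find b such that a + b*omega is primary (a = 2 mod 3, b = 0 mod 3)
    with norm a*a - a*b + b*b == p. Returns None if no such element exists."""
    bound = int(2 * p ** 0.5) + 2   # (2a - b)^2 + 3b^2 = 4p bounds |a| and |b|
    for a in range(-bound, bound + 1):
        if a % 3 != 2:
            continue
        for b in range(-bound, bound + 1):
            if b % 3 == 0 and a * a - a * b + b * b == p:
                return b
    return None

def three_is_cube(p):
    """Brute-force test for solvability of x**3 = 3 (mod p)."""
    return any(pow(x, 3, p) == 3 % p for x in range(p))

# Primes p = 1 (mod 3): solvability should match divisibility of b by 9.
for p in (7, 13, 19, 31, 37, 43, 61, 67, 73):
    assert (primary_b(p) % 9 == 0) == three_is_cube(p)
```

The two primary factors of p are conjugate and their b values differ only in sign, so the test $9 \mid b$ does not depend on which one the search returns.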


Chapter 10 

2. Map $[x_0, x_1, \ldots, x_{n-1}]$ to $[0, x_0, x_1, \ldots, x_{n-1}]$.

3. Since the number of points in $A^n(F)$ is $q^n$, the decomposition of $P^n(F)$
shows that the number of points in $P^n(F)$ is $q^n$ plus the number of points
in $P^{n-1}(F)$. One now proceeds by induction.

4. It is no loss of generality to assume that $a_0 \ne 0$. If $[x_0, x_1, \ldots, x_n]$ is a
solution, map it to the point $[x_1, x_2, \ldots, x_n]$ of $P^{n-1}(F)$. Show this map
is well defined, one to one, and onto.

5. Substitute, “dehomogenize,” and use the fact that a polynomial of
degree n has at most n roots.

9. The kth partial derivative is $m a_k x_k^{m-1}$. Since each $a_k \ne 0$ and m is prime
to the characteristic, the only common zero of all the partial derivatives
has all its components zero. This, however, does not correspond to a
point of projective space.

12. The “homogenized” equation is $t^2x^2 + t^2y^2 + x^2y^2 = 0$. Setting t = 0
we see that the points at infinity are (0, 0, 1) and (0, 1, 0). Calculating
partial derivatives and substituting shows that both these points are
singular.

14. Consider the associated homogeneous equation and calculate the three
partial derivatives. Assuming that a common solution exists, show that
$4a^3 + 27b^2 = 0$.

19. The trace is identically zero on $F_p$ iff $p \mid n$.

20. Consider the mapping $h(x) = x^p - x$ from $F_q$ to $F_q$. Prove that it is a
homomorphism and that its image has q/p elements. Prove also that the
image of h is contained in the kernel of the trace mapping. Show that the
latter map has at most q/p elements in its kernel. The result
follows.

21. Count the number of such maps. 

23. Substitute and calculate. 

Chapter 11 

4. In $F_q$ there are 2q + 1 points at infinity and $q^2$ finite points. Thus
$N_s = p^{2s} + 2p^s + 1$.
7. The number of lines in $P^n(F)$ is equal to the number of planes in $A^{n+1}(F)$
which pass through the origin. The answer is
$(q^{n+1} - 1)(q^{n+1} - q)/[(q^2 - 1)(q^2 - q)]$.

9. There is one point at infinity. For x = 0 there is only one point (0, 0)
on the curve. If $x \ne 0$, let t = y/x and consider $t^2 = x + 1$. This has
p - 2 solutions with $x \ne 0$. Altogether there are p solutions in $F_p$.
Similarly, there are q solutions in $F_q$. Thus the answer is $(1 - pu)^{-1}$.

12. To begin with, calculate the number of solutions to $u^2 - v^4 = 4D$.

16. The important facts are that $N_{F_s/F}$ is a homomorphism which is onto,
and that the group of multiplicative characters of a finite field is cyclic. 

18. Use the relation between Gauss sums and Jacobi sums and the Hasse- 
Davenport relation. 

19. After expanding the terms of the product into geometric series, the result 
reduces to the fact that every monic polynomial is the product of monic 
irreducible polynomials in a unique way. 

20. Use the identity $1 - T^s = \prod_{k=0}^{s-1} (1 - \zeta^k T)$, where $\zeta = e^{2\pi i/s}$.
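
The point count in the hint to Exercise 9 can be reproduced by enumeration. The Python sketch below is illustrative and assumes the curve is $y^2 = x^3 + x^2$, which is what the substitution t = y/x leading to $t^2 = x + 1$ suggests; the function name is hypothetical.

```python
def projective_point_count(p):
    """Count points on y^2 = x^3 + x^2 over F_p, including the single
    point at infinity (p is assumed to be an odd prime)."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y - x ** 3 - x * x) % p == 0)
    return affine + 1

# The hint predicts exactly p points: (0, 0), the p - 2 points with
# x != 0 coming from t != +1, -1, and the point at infinity.
for p in (3, 5, 7, 11, 13, 17):
    assert projective_point_count(p) == p
```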

Chapter 12 

7. $21 = (1 + 2\sqrt{-5})(1 - 2\sqrt{-5})$.

8. Write $\det(\omega_i^{(j)})$ as P - N, where P is the sum of terms corresponding to
the even permutations and N is the corresponding sum for odd permutations.
Then notice that $(P - N)^2 = (P + N)^2 - 4PN$. A standard
argument shows that P + N and PN are integers.

9. Use Proposition 12.1.4 and elementary symmetric functions.

14. Consider $\zeta + \zeta^{-1}$, where $\zeta$ is a primitive seventh root of unity.

21-23. Let $\{\beta_j\}$ be a basis for F over $Q(\alpha)$. Use the basis $\{\alpha^i\beta_j\}$ for F over Q.

26. Choose a primitive g for the residue field. Lift it to D and consider the
corresponding minimal polynomial over the fixed field of the decomposition
group (see [207], p. 223).


Chapter 13 

1. Show that $\phi(n)$ is even if n > 2.

2. Use Proposition 13.1.3. 

3. $Q(\sqrt{p}) \subset Q(\zeta_p)$.

24. The discriminant of a quadratic field is 0 or 1 modulo 4. 
27. The order of $\sigma_p$ cannot be 4. See Theorem 2.


Chapter 14 

1. (a) Use the definition of $J(\chi, \psi)$, the binomial theorem and Exercise 11,
Chapter 4. See also Lemma 1, Chapter 9, p. 115.

12. See Exercise 17(e). 

14. Let P be a prime ideal dividing p. Show (a/P)(a/P) = 1. See [166], 
Satz 1034. 
17. (b) Examine the ramification of l in the diagram

[field diagram; only the label "QM" survives]


(c) Note that ^ = CS = (1 - (1 - Q) r . 

(e) Use Theorem 1， Chapter 8 and the fact that g(xp) = g(XpY 、 

Chapter 15 

2. Use Theorem 3. 

3. Use Theorem 3 and Proposition 15.2.4. 

9. As a function of a complex variable $t(e^t - 1)^{-1}$ is analytic for $|t| < 2\pi$.

13. Use Exercise 12. 

21. Set F = 2 in Exercise 19. 

Chapter 16 

4. For another evaluation note that $\int_0^1 t^{3k}(1 - t)\,dt = 1/[(3k + 1)(3k + 2)]$.

7. Show that if $p \nmid m$ and $p \mid \Phi_m(N)$ for an integer N then $p \equiv 1 \ (m)$.

11. For an integer m choose a prime $p \equiv 1 \ (m)$ and consider subfields of
$Q(\zeta_p)$.

12. If p = t (m) then p | /(C p ) = /(O where ( is a primitive mth root of unity 
and/(x)eZ[x],/(C) = 0. 

14. Use Theorem 1, Chapter 6. 

Chapter 17 

2. y 2 4 — x 3 — 27. 

3. Imitate the proof of Proposition 17.8.1 ([60], Theorem 121). 

8. $(y + 2i)(y - 2i) = x^3$.

12. Consider $(x_1 + y_1\sqrt{d})^2$ for a solution $(x_1, y_1)$ of $x^2 - dy^2 = -1$.

13. $1^3 + 2^3 + \cdots + n^3 = (n(n + 1)/2)^2$.

16. Consider the map

$$(x_1, x_2, x_3, x_4) \mapsto \left( \frac{x_1 + x_2}{2}, \frac{x_1 - x_2}{2}, \frac{x_3 + x_4}{2}, \frac{x_3 - x_4}{2} \right).$$

18. $\binom{4}{2} = 6$.

19. Consider the hint for Problem 16. 
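
The squaring step in the hint to Exercise 12 can be made concrete. The following is a small illustrative Python sketch (the function name is an assumption): squaring $x_1 + y_1\sqrt{d}$ multiplies norms, so a solution of $x^2 - dy^2 = -1$ yields a solution of $x^2 - dy^2 = 1$.

```python
def square_pell_solution(x1, y1, d):
    """Return (x, y) with x + y*sqrt(d) == (x1 + y1*sqrt(d))**2."""
    return x1 * x1 + d * y1 * y1, 2 * x1 * y1

# (x1, y1, d) with x1^2 - d*y1^2 == -1
for x1, y1, d in ((1, 1, 2), (2, 1, 5), (18, 5, 13)):
    assert x1 * x1 - d * y1 * y1 == -1
    x, y = square_pell_solution(x1, y1, d)
    assert x * x - d * y * y == 1   # the norm is squared: (-1)^2 = 1
```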


Chapter 18 

4. If t is the order of the torsion subgroup of E then for $p \equiv 2 \ (3)$,
$p \equiv -1 \ (t)$. The density of the set of primes $\equiv -1 \ (t)$ is $1/\phi(t)$, while the
density of primes $p \equiv 2 \ (3)$ is 1/2.

8. (a) Prove first for $\mathfrak{A} = P$ using $(N(P) - 2)N(P) = (N(P) - 1)^2 - 1$.

(b) See Exercise 4, Chapter 14. For |u(a, b)| = 1, apply x (cf. Lemma 4,
Section 5, Chapter 14).

(c) Show that d is invariant under the action of the appropriate Galois 
group. 

12. (a) See Chapter 11. 

(b) See Exercise 4. 

(c) See Exercise 17. 